JP3364820B2 - Synthetic voice output method and apparatus

Info

Publication number: JP3364820B2
Application number: JP11592595A
Authority: JP
Inventors: 久子阿部; 芳史大山; 浩司松岡
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1995-05-15
Filing date: 1995-05-15
Publication date: 2003-01-08
Anticipated expiration: 2018-01-08
Also published as: JPH08314901A

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a synthetic speech output method and apparatus. More particularly, it relates to a method and apparatus that, in text-to-speech processing that reads aloud Japanese documents written in mixed kanji and kana, divide compound words containing a temporal noun (時詞) into accent phrases with high accuracy and thereby output speech free of unnaturalness.

[0002]

[Prior Art] When digitized Japanese text written in mixed kanji and kana, such as newspaper articles and e-mail, is read aloud with synthetic speech, producing more natural output requires at least the following: assigning readings to the kanji-kana text, dividing each sentence into accent phrases, determining the accent type of each accent phrase, and setting pauses between accent phrases at appropriate intervals.

[0003] Text-to-speech methods and devices of this kind are already in practical use, but their output sounds considerably unnatural compared with human speech. This unnaturalness stems mainly from the accuracy with which prosodic information is set. Prosodic information includes accent phrase information and pause information, whose characteristics are described next.

[0004] An accent phrase is a unit that is uttered as a single group in natural speech, and it appears in the pitch pattern (the time pattern of the fundamental frequency F₀). In the Tokyo dialect of Japanese, it is defined as a unit containing at most one accent nucleus (the mora at which F₀ begins to fall sharply), and it corresponds roughly to a bunsetsu. However, several bunsetsu may form one accent phrase, and a single bunsetsu may be split into several accent phrases.

[0005] When an accent phrase consists of several words, the accent types of the individual words may disappear or shift, so an accent type must be set for the accent phrase as a whole. Accent phrase information holds the extent of each accent phrase and its accent type. Pauses are set between accent phrases at suitable intervals and with suitable lengths, according to syntactic and semantic factors and to constraints on utterance (breathing). Pause information holds, for each gap between accent phrases, whether a pause is present and, if so, its length.

[0006] Sentences have conventionally been divided into accent phrases by means of morphological analysis. The basic rule is that "one independent word + the attached words that follow it (possibly several)" forms one accent phrase, where the stem of an inflecting word together with its inflected ending counts as one independent word. For example, the sentence 「今日の東京の天気は晴れでしょう。」 ("Today's weather in Tokyo will probably be fine.") is divided into the accent phrases 「今日の／東京の／天気は／晴れでしょう。」, where ／ marks an accent phrase boundary.
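The basic rule in [0006] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Word` class, the `is_independent` flag, and the morpheme segmentation of the example sentence are hypothetical stand-ins for the output of a morphological analyzer, which the patent does not specify at this level.

```python
from dataclasses import dataclass


@dataclass
class Word:
    surface: str
    is_independent: bool  # True: independent word, False: attached word (hypothetical flag)


def split_basic(words):
    """Open a new accent phrase at every independent word; attach everything else to the current one."""
    phrases = []
    for w in words:
        if w.is_independent or not phrases:
            phrases.append([w.surface])
        else:
            phrases[-1].append(w.surface)
    return ["".join(p) for p in phrases]


# Assumed segmentation of the example sentence in [0006].
sentence = [
    Word("今日", True), Word("の", False),
    Word("東京", True), Word("の", False),
    Word("天気", True), Word("は", False),
    Word("晴れ", True), Word("でしょ", False), Word("う", False), Word("。", False),
]

print("／".join(split_basic(sentence)))  # 今日の／東京の／天気は／晴れでしょう。
```

Because the rule looks only at the independent/attached distinction, it has no way to decide whether a run of consecutive nouns should stay together, which is the problem the following paragraphs turn to.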

[0007] In compound nouns, however, a run of roughly two to four consecutive independent words sometimes forms a single accent phrase and sometimes is split partway into several accent phrases. A method for dividing compound nouns into accent phrases is therefore needed. Here, a compound noun is taken to be a word string of consecutive nominals (nouns and affixes). Strictly speaking, a bunsetsu boundary may fall inside a string of consecutive nominals, in which case the string is not a single compound noun. For example, in the sentence 「毎年海外に行く。」 ("[I] go abroad every year."), 『毎年海外』 is not a compound noun: there is a bunsetsu boundary between 『毎年』 ("every year") and 『海外』 ("abroad"). (In 「毎年恒例の行事」 ("an annual event"), by contrast, 『毎年恒例』 is a single compound noun.) Morphological analysis does not identify bunsetsu, so it cannot distinguish a string that only looks like a compound noun, such as 「毎年海外」 (hereinafter called an "apparent compound noun"), from a genuine compound noun such as 「毎年恒例」. An accent phrase division method based on morphological analysis therefore has no choice but to treat both as compound nouns.

[0008] Two representative documents on conventional accent phrase division using morphological analysis are cited here. The first is "野村：単語の分類を用いた複合語のアクセント句分割とアクセント付与、電子情報通信学会論文誌 Vol.J75-D-II, No.9, pp.1479-1488 (1992-9)" (Nomura, "Accent phrase division and accent assignment for compound words using word classification," IEICE Transactions). This method classifies independent words into four types from the standpoint of accent division and divides accent phrases using that classification together with the part of speech.

[0009] The second is "宮崎：単語間の意味的結合関係を用いた複合語アクセント句の自動抽出法、電子情報通信学会論文誌 Vol.J68-D, No11, pp.25-32 (1985-1)" (Miyazaki, "Automatic extraction of compound-word accent phrases using semantic linking relations between words," IEICE Transactions). This method divides accent phrases on the basis of the part of speech and a dependency analysis within the compound noun.

[0010] These methods divide cases such as (1) and (2) below into accent phrases correctly. Here ／ marks a correct accent phrase boundary and 「」 encloses one word; the examples that follow use the same notation.

(1) 「化学」「工場」 ("chemical plant")
(2) 「小型」／「処理」「装置」 ("small processing device")

[0011]

[Problems to Be Solved by the Invention] The conventional methods above handle examples (1) and (2) without difficulty, but they have trouble with compound nouns that contain a temporal noun (時詞: a noun that denotes time and may also be used adverbially), such as (3) and (4) below. Example (3) is an apparent compound noun of the kind that appears as the 「」 part of the sentence 『「毎月給料」と同程度の利益を上げられた』 ("every month [we] made a profit comparable to [our] salary").

[0012]

(3) 「毎月」／「給料」 ("every month" / "salary")
(4) 「半年」「コース」 ("half-year course")

The first document applies no special rule to temporal nouns. It therefore divides correctly when the temporal noun is used in the same way as a common noun, as in example (4). But because it treats an apparent compound noun such as example (3) in the same way, 「毎月給料」 ends up as a single accent phrase.

[0013] The second document, by contrast, states that temporal nouns "are highly independent and therefore form independent accent phrases (accent phrases containing exactly one independent word)." (That document calls them "adverbial nouns": nouns such as 「今年」, 「朝」, and 「春」 that denote time and are used adverbially.) It therefore divides example (3) correctly. However, it makes the same division even when the temporal noun is strongly linked to the word immediately after it, as in example (4).

[0014] Thus, conventional accent phrase division methods cannot divide accent phrases correctly when a compound noun contains a temporal noun, as in examples (3) and (4). The present invention was made in view of this problem. Its object is to provide a synthetic speech output method and apparatus that output more natural synthetic speech by dividing compound nouns containing temporal nouns into accent phrases more accurately.

[0015]

[Means for Solving the Problems] FIG. 1 illustrates the principle of the present invention. The present invention is a synthetic speech output method that reads aloud Japanese documents written in mixed kanji and kana with synthetic speech, and it comprises:

a word recognition process (step 1) that identifies each word in an input Japanese kana-mixed sentence;
a word analysis process that performs morphological analysis on the words identified by the word recognition process and obtains information for each word;
a reading assignment process (step 2) that assigns readings (読み仮名) to the sentence using the analyzed words and the information for each word;
an accent phrase division process (step 3) that divides the sentence into accent phrases using information on whether a temporal noun is used adnominally, dependency information within the compound noun, the parts of speech of the words constituting the compound noun, and the number of words in the compound word;
an accent type setting process (step 4) that sets the accent type of each resulting accent phrase;
a pause setting process (step 5) that sets pause information between the resulting accent phrases; and
a speech synthesis process (step 6) that generates synthetic speech with prosodic information corresponding to the sentence, using the readings assigned by the reading assignment process, the accent types, and the pause information.

The accent phrase division process (step 3) performs a word pattern identification and change process and an end determination process. When a temporal noun in the given compound word is used adnominally, the word pattern identification and change process changes the temporal-noun word pattern to be generated according to the part of speech of the word immediately after the temporal noun. The end determination process determines that the temporal noun and the word immediately after it constitute a temporal-noun word pattern when the word immediately after the temporal noun is judged by the word pattern identification and change process to be a sa-hen verbal noun (サ変動詞型名詞) or a common noun in a dependency relation and that word is at the end of the compound word, or when the word two positions after the temporal noun is a suffix, an attached word, or a symbol.

[0016]

[0017] FIG. 2 shows the principle configuration of the present invention. The present invention is a synthetic speech output apparatus that reads aloud Japanese documents written in mixed kanji and kana with synthetic speech, and it comprises:

word analysis means 100 that identifies each word in an input Japanese kana-mixed sentence, performs morphological analysis on the identified words, and obtains information for each word;
reading assignment means 210 that assigns readings to the sentence using the words analyzed by the word analysis means 100 and the information for each word;
accent phrase division means 220 that divides the sentence into accent phrases using information on whether a temporal noun is used adnominally, dependency information within the compound noun, the parts of speech of the words constituting the compound noun, and the number of words in the compound word;
accent type setting means 230 that sets the accent type of each resulting accent phrase;
pause setting means 240 that sets pause information between the resulting accent phrases; and
speech synthesis means 300 that generates synthetic speech with prosodic information corresponding to the sentence, using the readings assigned by the reading assignment means 210, the accent types, and the pause information.

The accent phrase division means 220 has word pattern identification and change means and end determination means. When a temporal noun in the given compound word is used adnominally, the word pattern identification and change means changes the temporal-noun word pattern to be generated according to the part of speech of the word immediately after the temporal noun. The end determination means determines that the temporal noun and the word immediately after it constitute a temporal-noun word pattern when the word immediately after the temporal noun is judged by the word pattern identification and change means to be a sa-hen verbal noun or a common noun in a dependency relation and that word is at the end of the compound word, or when the word two positions after the temporal noun is a suffix, an attached word, or a symbol.

[0018]

[0019]

[Operation] The present invention divides compound nouns containing a temporal noun into accent phrases using four kinds of information: whether the temporal noun is used adnominally; the dependency analysis within compound nouns used in the second document; the parts of speech and similar properties of the words constituting the compound noun; and the number of constituent words in the compound noun that have not yet been divided into accent phrases. That is, the invention relies on the observation that a temporal noun and the word immediately after it form one accent phrase when the two are strongly linked semantically within a compound noun. It determines accent phrases using whether the temporal noun is used adnominally, the part of speech of the word immediately after it, the number of nouns that follow it, and the intra-compound dependency information from the temporal noun to the word immediately after it. This makes it possible to divide compound nouns containing temporal nouns into accent phrases with high accuracy.

[0020] With accent phrases divided in this way, the synthetic speech output apparatus can output speech that is close to human speech.

[0021]

[Embodiments] Embodiments of the present invention are described in detail below with reference to the drawings. In the following description, a (partial) compound noun is written W1 W2 W3 … Wa (W1: the first word of the compound noun; W2: the second word; W3: the third word; Wa: the a-th, that is, final word). The temporal noun is written Wt (t is any value from 1 to a−1), the word immediately after it is written Wt+1, and the word immediately after Wt+1 is written Wt+2.

[0022] Here, a partial compound noun is each of the word strings obtained by dividing a compound noun according to the parts of speech and characteristics of its constituent words. First, a temporal noun that can be used adnominally (in the sense of 「〜の」) within a compound noun is likely to link with the word immediately after it. Accordingly, the fact that the temporal noun Wt can be used adnominally within a compound noun is made a necessary condition for linking Wt with the immediately following word Wt+1.

[0023] Whether a word can be used adnominally (in the sense of 「〜の」) within a compound noun is recorded in the dictionary information for that word. Next, attention is paid to the part of speech of Wt+1. When Wt+1 is a converted noun (転成名詞: a noun derived from an inflecting word, for example 休み, 生まれ, 上り), the relation between Wt and Wt+1 is in many cases one that is syntactically an adverbial modification but has been turned into an adnominal one. For example, 「正月休み」 ("New Year holiday") is the adverbial modification 「正月に休む」 ("to take time off at New Year") turned into the adnominal 「正月の休み」 ("the holiday of New Year"). Words that denote time generally modify words that function as predicates, so this kind of link is very common. In this case, therefore, the temporal noun Wt and the immediately following word Wt+1 are always placed in the same accent phrase.

[0024] Likewise, when the immediately following word Wt+1 is a sa-hen verbal noun, the relation is often one that is syntactically an adverbial modification turned into an adnominal one. For example, 「今期予想」 ("current-term forecast") is the adverbial modification 「今期に予想する」 ("to forecast in the current term") turned into the adnominal 「今期の予想」 ("the forecast of the current term"). A sa-hen verbal noun, however, links strongly with the word immediately after it. For example, in the compound noun 「今期予想利益」 ("current-term forecast profit"), the link between 「予想」 and 「利益」 is stronger than the link between 「今期」 and 「予想」, so the accent phrases are 「今期／予想利益」. Accordingly, when Wt+2 is a noun, the link between Wt+1 and Wt+2 is taken to be stronger than the link between Wt and Wt+1, and Wt and Wt+1 are made one accent phrase only when Wt+2 is not a noun or does not exist.

[0025] When Wt+2 is a noun, Wt alone is made one accent phrase. When Wt+2 is a suffix, the links among Wt, Wt+1, and Wt+2 are taken to be of about the same strength, and Wt, Wt+1, and Wt+2 together are made one accent phrase. When Wt+1 is a common noun, no syntactic modification structure can be assumed between Wt and Wt+1. In that case, the intra-compound dependency analysis, which uses semantic information about words, is applied instead. Wt and Wt+1 are judged to be linked when a common-noun dependency (meaning that the words are strongly connected semantically) holds from Wt to Wt+1.

[0026] When Wt+2 is a noun, however, the link between Wt+1 and Wt+2 is in many cases stronger than the link between Wt and Wt+1. Accordingly, Wt and Wt+1 are made one accent phrase when Wt+2 is not a noun or does not exist. When Wt+2 is a noun, Wt alone is made one accent phrase. When Wt+2 is a suffix, the links among Wt, Wt+1, and Wt+2 are taken to be of about the same strength, and Wt, Wt+1, and Wt+2 together are made one accent phrase.

[0027] For example, in 「来期目標」 ("next-term target"), a common-noun dependency holds between 「来期」 and 「目標」, so 「来期目標」 is one accent phrase. In 「来期目標台数」 ("next-term target unit count"), however, the link between 「目標」 and 「台数」 is stronger than the link between 「来期」 and 「目標」, so the accent phrases are 「来期／目標台数」.
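The decision rules of [0022] to [0027] can be condensed into one small function. The sketch below is an illustrative reading of those paragraphs, not the patented implementation: the `Pos` enumeration, the function name `temporal_group_size`, and its parameters are hypothetical, the set of parts of speech counted as "noun" is deliberately reduced, and the fallback of leaving Wt on its own when no rule applies is an assumption.

```python
from enum import Enum, auto


class Pos(Enum):
    TEMPORAL = auto()   # 時詞
    CONVERTED = auto()  # 転成名詞
    SAHEN = auto()      # サ変動詞型名詞
    COMMON = auto()     # 一般名詞
    SUFFIX = auto()     # 接尾辞
    OTHER = auto()      # attached words, symbols, and so on


# Simplified noun set for this sketch (the source lists more noun classes).
NOUNS = {Pos.TEMPORAL, Pos.CONVERTED, Pos.SAHEN, Pos.COMMON}


def temporal_group_size(adnominal, next_pos, next2_pos=None, has_dependency=False):
    """Return how many words, counted from Wt, are grouped into one accent phrase.

    adnominal:      dictionary flag, True if Wt can be used adnominally ([0022])
    next_pos:       part of speech of Wt+1, or None if Wt+1 does not exist
    next2_pos:      part of speech of Wt+2, or None if Wt+2 does not exist
    has_dependency: True if a common-noun dependency holds from Wt to Wt+1 ([0025])
    """
    if not adnominal or next_pos is None:
        return 1                      # necessary condition not met (assumed fallback)
    if next_pos is Pos.CONVERTED:
        return 2                      # [0023]: always grouped with Wt+1
    linked = next_pos is Pos.SAHEN or (next_pos is Pos.COMMON and has_dependency)
    if not linked:
        return 1                      # assumed fallback: Wt stands alone
    if next2_pos is Pos.SUFFIX:
        return 3                      # Wt, Wt+1 and Wt+2 form one accent phrase
    if next2_pos in NOUNS:
        return 1                      # Wt+1 links more strongly with Wt+2
    return 2                          # Wt+2 is not a noun or does not exist


print(temporal_group_size(True, Pos.SAHEN))                      # 今期予想
print(temporal_group_size(True, Pos.SAHEN, Pos.COMMON))          # 今期予想利益
print(temporal_group_size(True, Pos.COMMON, None, True))         # 来期目標
print(temporal_group_size(True, Pos.COMMON, Pos.COMMON, True))   # 来期目標台数
```

Under these assumptions the four calls return 2, 1, 2, and 1, matching the divisions 「今期予想」, 「今期／予想利益」, 「来期目標」, and 「来期／目標台数」 given in the text.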

[0028] The configuration of the synthetic speech output apparatus of the present invention is described next. FIG. 3 is a block diagram of a synthetic speech output apparatus according to one embodiment of the present invention. The apparatus shown in the figure consists of a morphological analysis unit 100, a word dictionary 110, a reading and prosodic information setting unit 200, and a synthetic speech generation unit 300.

[0029] When a kanji-kana mixed sentence to be output as synthetic speech is input, the morphological analysis unit 100 refers to the word dictionary 110 and outputs a segmented word string with word information, b. This word string b is the sentence divided into words, with the word information for each word (part of speech, accent type, reading, and so on) obtained from the word dictionary 110 attached.

[0030] The reading and prosodic information setting unit 200 consists of a reading assignment unit 210, an accent phrase division unit 220, an accent type setting unit 230, and a pause setting unit 240. It takes the segmented word string with word information, b, as input and outputs a kana string with prosodic information.

[0031] The reading assignment unit 210 of the reading and prosodic information setting unit 200 assigns a reading to each word of the word string b. It mainly uses the reading stored in the word information, but it also performs rendaku (sequential voicing) processing and numeral processing (choosing between reading a number by place value and reading it digit by digit).

[0032] The accent phrase division unit 220 divides the input sentence into accent phrases by exploiting the tendency of a noun and the word immediately after it to form one accent phrase when the two are strongly linked semantically within a compound noun. The details of this processing are described later.

[0033] The accent type setting unit 230 sets an accent type for each accent phrase that the accent phrase division unit 220 has formed from several words. The pause setting unit 240 sets, for each gap between accent phrases, whether a pause is present and, if so, its length. The pause setting unit 240 outputs a kana string with prosodic information, c.

[0034] The kana string with prosodic information, c, output from the pause setting unit 240 represents the reading of the sentence to be output as synthetic speech, and it carries as prosodic information the accent types, the pause information, the choice of male or female voice, the pitch level, and so on. The synthetic speech generation unit 300 takes the kana string c as input and outputs synthetic speech.

[0035] The operation of the accent phrase division unit 220 is described in detail next. FIG. 4 is a flowchart of the processing performed by the accent phrase division unit in one embodiment of the present invention. In the following description, accent phrase division proceeds in order from the beginning of the sentence to the end.

[0036] Step 100) Segment division divides the sentence into segments. When the first word of a segment is a noun (common noun, sa-hen verbal noun, adjectival-verb-type noun (形容動詞型名詞), converted noun, temporal noun, numeral, pronoun, or proper noun), or when the segment begins with a prefix and its second word is a noun, one segment is defined as:

(prefix) + noun + (consecutive nouns or suffixes) + (consecutive attached words)

Such a segment is called a compound noun phrase segment.

[0037] In all other cases, one segment is defined as:

(prefix) + independent word other than a noun + (suffix) + (consecutive attached words)

Such a segment is called an independent segment. In both definitions, parentheses mean that the word is included in the segment only when it is present. The total number of segments is denoted n.
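The two segment definitions of step 100 can be sketched as a single left-to-right scan. This is a minimal illustration under stated assumptions: the function name `split_segments`, the five coarse category labels, and the category assigned to each example word (including treating 元 as a prefix) are hypothetical, and all noun classes are collapsed into one `"noun"` label.

```python
def split_segments(words):
    """Split a word list into segments following the two patterns of step 100.

    words: list of (surface, category) pairs, where category is one of
           'prefix', 'noun', 'suffix', 'attached', 'other' (hypothetical labels).
    Returns a list of (kind, [surfaces]) with kind 'compound_noun_phrase' or 'independent'.
    """
    segments = []
    i, n = 0, len(words)
    while i < n:
        start = i
        # Optional leading prefix (kept only if another word follows it).
        if words[i][1] == "prefix" and i + 1 < n:
            i += 1
        if words[i][1] == "noun":
            kind = "compound_noun_phrase"
            i += 1
            # Consecutive nouns or suffixes; a prefix ends the run, so the
            # next segment starts immediately before it.
            while i < n and words[i][1] in ("noun", "suffix"):
                i += 1
        else:
            kind = "independent"
            i += 1
            if i < n and words[i][1] == "suffix":
                i += 1
        # Consecutive attached words close the segment.
        while i < n and words[i][1] == "attached":
            i += 1
        segments.append((kind, [surface for surface, _ in words[start:i]]))
    return segments


example = [("毎年", "noun"), ("海外", "noun"), ("に", "attached"), ("行く", "other")]
print(split_segments(example))
```

For the example, the scan yields one compound noun phrase segment (毎年, 海外, に) followed by one independent segment (行く), which shows how an apparent compound noun ends up inside a single compound noun phrase segment.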

[0038] Here, attached words include auxiliary predicates (補助用言) and adjectival nouns (形容名詞). A compound noun phrase segment is a compound noun followed by a string of attached words. When a compound noun contains a prefix, however, the segment is split immediately before the prefix.

[0039] Step 200) Assign 1 to the segment counter i.

Step 300) Determine whether the i-th segment Sg(i) is a compound noun phrase segment. If it is, go to step 400; otherwise, go to step 600.

[0040] Step 400) Compound noun phrase accent phrase division divides a compound noun phrase segment into accent phrases.

Step 410) Divide accent phrases by word characteristics. This process divides accent phrases according to the part of speech of a word and characteristics specific to the word. Word-specific characteristics are based on the dictionary information attached to each word during morphological analysis. Examples of this processing follow.

・Numbers are divided into accent phrases digit by digit.

[0041] (Example) 123 (ヒャクニジューサン) → ヒャク／ニジュー／サン

・Some words always form an accent phrase on their own.

(Example) 「元（モト）」 → 「田中／元／首相」, 「元／代表」, and so on

・Some words take an accent phrase boundary immediately before them.

[0042] (Example) 「増強（ゾーキョー）」 → 「軍備／増強」

・Some words take an accent phrase boundary immediately after them.

(Example) 「以下（イカ）」 → 「課長以下／対象」

In this way, accent phrase boundaries are set in the segment according to the part of speech of a word or characteristics specific to the word. Those boundaries decompose Sg(i) into m partial segments.
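The three word-specific rules listed for step 410 (always alone, boundary before, boundary after) can be sketched as a table-driven pass over the words of a segment. The flag names `"alone"`, `"before"`, and `"after"`, the `traits` table, and the function name are hypothetical; the patent says only that such characteristics come from the dictionary information of each word. Digit-by-digit splitting of numbers is left out of this sketch.

```python
def split_by_word_traits(words, traits):
    """Split a list of word surfaces into partial segments using per-word boundary flags.

    traits maps a surface form to a set of flags:
      'alone'  - the word always forms an accent phrase by itself
      'before' - an accent phrase boundary falls immediately before the word
      'after'  - an accent phrase boundary falls immediately after the word
    """
    groups = [[]]
    for word in words:
        flags = traits.get(word, set())
        if flags & {"alone", "before"} and groups[-1]:
            groups.append([])          # close the current group before this word
        groups[-1].append(word)
        if flags & {"alone", "after"}:
            groups.append([])          # close the current group after this word
    return ["".join(g) for g in groups if g]


# Flags assumed for the example words in [0041] and [0042].
traits = {"元": {"alone"}, "増強": {"before"}, "以下": {"after"}}

print("／".join(split_by_word_traits(["田中", "元", "首相"], traits)))  # 田中／元／首相
print("／".join(split_by_word_traits(["軍備", "増強"], traits)))        # 軍備／増強
print("／".join(split_by_word_traits(["課長", "以下", "対象"], traits)))  # 課長以下／対象
```

Keeping the flags in a lookup table mirrors the idea that the behaviour is a property of the dictionary entry rather than of the division algorithm itself.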

[0043] Step 420) Divide accent phrases by word pattern. This process extracts groups of strongly linked words in the compound noun of a partial segment as word patterns and sets accent phrase boundaries on the basis of those patterns. Word patterns include the temporal-noun word pattern, which is the subject of the present invention, as well as the proper-noun word pattern, the numeral and counter word pattern (数詞助数単語パターン), and others. The details of this process are described with reference to FIG. 5. This step processes the m partial segments one at a time. As a result, segment Sg(i) is decomposed into k partial segments (k ≥ m).

【００４４】Step 430) Accent phrases are divided according to the number of remaining words. Among the k partial segments, this process sets accent phrase boundaries in those for which no word pattern was generated, using the number of words in the partial segment, the parts of speech of those words, and similar information. This step uses the accent phrase extraction based on the binding strength between words described in the second reference cited above.

【００４５】As described above, in step 104, segment Sg(i) is divided into accent phrases at the accent phrase boundaries that have been set.
Step 500) It is determined whether the segment counter i equals n. If it does, processing ends. If it does not, processing proceeds to step 700.

【００４６】Step 600) Accent phrase division of an independent segment is performed. In this step, Sg(i) is basically treated as a single accent phrase. Processing then proceeds to step 500.
Step 700) The segment counter i is incremented by 1. Processing then proceeds to step 300.
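The control flow of steps 300 to 700 amounts to a loop over the n segments. A minimal sketch, assuming the segments are already available as a list and that the predicate `is_compound_noun_phrase` and the splitter `split_compound` (both hypothetical names) stand in for step 300 and for steps 410 to 430:

```python
def divide_segments(segments, is_compound_noun_phrase, split_compound):
    """Return the accent phrases of segments Sg(1)..Sg(n), in order."""
    accent_phrases = []
    for segment in segments:  # the counter i of steps 500 and 700
        if is_compound_noun_phrase(segment):                # step 300
            accent_phrases.extend(split_compound(segment))  # step 400
        else:
            accent_phrases.append(segment)  # step 600: one accent phrase
    return accent_phrases
```

The `for` loop replaces the explicit counter test (step 500) and increment (step 700); the behavior is the same for any non-empty list of segments.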

【００４７】Next, the accent phrase division process by word pattern in step 420 above is described in detail. FIG. 5 is a flowchart of the accent phrase division process by word pattern according to an embodiment of the present invention.

【００４８】Step 421) As initialization, 1 is assigned to the word counter j, and the number of independent words in the partial segment is assigned to p.
Step 422) It is determined whether the part of speech of the j-th word Wj of the partial segment is a time word. If it is, processing proceeds to step 423; otherwise, processing proceeds to step 425.

【００４９】Step 423) The time-word pattern generation process is performed. Details are described with reference to FIG. 6.
Step 424) It is determined whether the word counter j equals the number of independent words p. If they are equal, processing proceeds to step 427; if not, processing proceeds to step 426.

【００５０】Step 425) Other word pattern generation processes are performed, such as proper-noun word pattern generation and numeral-classifier word pattern generation. Processing then proceeds to step 424.
Step 426) The word counter j is incremented by 1, and processing proceeds to step 422.

【００５１】Step 427) Affixes are absorbed into the word pattern. If the word immediately before a generated word pattern, within the same partial segment, is a prefix, or the word immediately after it is a suffix, that prefix or suffix is absorbed into the word pattern.

【００５２】Step 428) Attached words are absorbed into the word pattern. If the word (or word sequence) immediately after a generated word pattern is an attached word (or a sequence of attached words), it is absorbed into the word pattern.
Next, the time-word pattern generation process of step 423 above is described. FIG. 6 is a flowchart of the word pattern generation process according to an embodiment of the present invention.

【００５３】Step 4231) It is determined whether the in-compound adnominalization flag of Wj is on. If it is on, processing proceeds to step 4232; if it is off, processing proceeds to step 4239. The in-compound adnominalization flag is one item of word information, obtained by consulting the word dictionary 110 during processing by the morphological analysis unit 100 of FIG. 3. It indicates whether the word is used adnominally (in the sense of 「〜の」, "of ...") inside a compound word. Examples of words whose flag is on, together with compounds containing them, are:
「大型」 → 「大型冷蔵庫」 (large refrigerator, i.e. 大型の冷蔵庫)
「近代」 → 「近代絵画」 (modern painting, i.e. 近代の絵画)
「クリスマス」 → 「クリスマス休暇」 (Christmas holiday, i.e. クリスマスの休暇)

【００５４】Step 4232) It is determined whether the part of speech of the (j+1)-th word Wj+1 of the partial segment is a converted noun (転成名詞). If it is, processing proceeds to step 4237.
Step 4233) It is determined whether the part of speech of Wj+1 is a general noun. If it is, processing proceeds to step 4235; otherwise, processing proceeds to step 4234.

【００５５】Step 4234) It is determined whether the part of speech of Wj+1 is a sa-hen verb-type noun (サ変動詞型名詞). If it is, processing proceeds to step 4236; otherwise, processing proceeds to step 4239.
Step 4235) It is determined whether a general-noun dependency holds from Wj to Wj+1. If it does, processing proceeds to step 4236; if it does not, processing proceeds to step 4239.

【００５６】Here, a general-noun dependency is a dependency within a compound word that holds from a general noun, time word, or pronoun (the front word) to a general noun or suffix that follows it (the rear word). The dependency holds when the backward-connection semantic category of the front word matches the semantic category of the rear word, or when the forward-connection semantic category of the rear word matches the semantic category of the front word. Semantic categories classify nouns hierarchically by meaning.
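The dependency test described above can be sketched in Python. The text does not define what it means for two hierarchical categories to "match", so this sketch assumes a category is a path from the root of the hierarchy and that an expected category matches an actual one when it equals it or is one of its ancestors. The dictionary keys `category`, `accepts_after`, and `accepts_before`, and the function names, are hypothetical.

```python
def category_matches(expected, actual):
    """Assumed rule: `expected` matches if it is `actual` or an ancestor of it.

    Both arguments are paths from the root, e.g. ("time", "event").
    """
    return tuple(actual[:len(expected)]) == tuple(expected)


def general_noun_dependency(front, rear):
    """Return True if a general-noun dependency holds from front to rear.

    front["accepts_after"]: backward-connection categories of the front word.
    rear["accepts_before"]: forward-connection categories of the rear word.
    """
    forward = any(category_matches(c, rear["category"])
                  for c in front.get("accepts_after", ()))
    backward = any(category_matches(c, front["category"])
                   for c in rear.get("accepts_before", ()))
    return forward or backward
```

Either direction is sufficient, which mirrors the "or" in the definition: a word pair passes if the front word declares what may follow it, or if the rear word declares what may precede it.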

【００５７】Step 4236) It is determined whether the word counter j plus 1 equals p, or whether the part of speech of the (j+2)-th word Wj+2 of the partial segment is a suffix, an attached word, or a symbol. If the condition is met, processing proceeds to step 4237; if not, processing proceeds to step 4239.

【００５８】Step 4237) Wj and Wj+1 are combined into a time-word pattern.
Step 4238) The word counter j is incremented by 1.
Step 4239) Wj is taken as a time-word pattern.
After this processing, the time-word pattern generation process ends.
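The decision sequence of steps 4231 to 4239 can be sketched as a single Python function. This is an illustrative reading of the flowchart rather than a definitive implementation: the `Word` class, the part-of-speech labels, and the `depends` callback (standing in for the general-noun dependency test of step 4235) are hypothetical, indices are 0-based where the text counts from 1, and the bounds check for a time word with no following word is an added assumption.

```python
from dataclasses import dataclass


@dataclass
class Word:
    surface: str
    pos: str                 # hypothetical labels, e.g. "time", "general_noun"
    adnominal: bool = False  # in-compound adnominalization flag


CLOSERS = {"suffix", "attached", "symbol"}


def time_pattern_length(words, j, p, depends):
    """Return 2 if words[j] and words[j+1] form one time-word pattern, else 1.

    j is 0-based; p is the number of independent words in the partial segment;
    depends(front, rear) stands in for the dependency test of step 4235.
    """
    w = words[j]
    if not w.adnominal or j + 1 >= len(words):  # step 4231 (bounds check assumed)
        return 1                                # step 4239
    nxt = words[j + 1]
    if nxt.pos == "converted_noun":             # step 4232
        return 2                                # step 4237
    if nxt.pos == "general_noun":               # step 4233
        if not depends(w, nxt):                 # step 4235
            return 1
    elif nxt.pos != "sahen_noun":               # step 4234
        return 1
    # Step 4236: Wj+1 is the last independent word, or Wj+2 closes the pattern.
    is_last = (j + 2 == p)
    closed = j + 2 < len(words) and words[j + 2].pos in CLOSERS
    return 2 if (is_last or closed) else 1
```

Returning the pattern length lets the caller advance its counter by that amount, which plays the role of the counter increment in step 4238.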

【００５９】Next, the accent phrase division processing of FIGS. 4, 5, and 6 is explained with FIG. 8, using the example sentence of FIG. 7. The kanji-kana mixed sentence 『ニューヨーク州では、毎年クリスマス休暇頃に記録的大寒波が襲う』 ("In New York State, a record-breaking cold wave strikes around the Christmas holidays every year") is input and is analyzed by the morphological analysis unit 100 as shown in FIG. 8(A).

【００６０】Next, the segment division process of step 100 (FIG. 4) divides the sentence into five segments. As shown in FIG. 8(B), Sg(1) to Sg(4) are compound noun phrase segments and Sg(5) is an independent segment. Consider now the case i = 2.

【００６１】At step 300 of FIG. 4, Sg(2) is a compound noun phrase segment, so processing proceeds to the accent phrase division by word characteristics of step 410. No applicable word is present, so nothing is done. Processing then proceeds to the accent phrase division by word pattern of step 420. At step 422 of FIG. 5, W1 = 「毎年」 ("every year") is a time word, so processing proceeds to the time-word pattern generation of step 423.

【００６２】Next, at step 4231 of FIG. 6, the in-compound adnominalization flag of W1 = 「毎年」 is off, so processing proceeds to step 4239, and W1 = 「毎年」 is taken as a time-word pattern. Control then returns to FIG. 5 (step 423), where, since 1 ≠ p (= 4), processing proceeds to step 426. With j = 2, processing proceeds to step 422. W2 = 「クリスマス」 ("Christmas") is a time word, so processing proceeds to the time-word pattern generation of step 423.

【００６３】Next, at step 4231 of FIG. 6, the in-compound adnominalization flag of W2 = 「クリスマス」 is on, so processing proceeds to step 4232. W3 = 「休暇」 ("holiday") is a general noun, so processing passes from step 4233 to step 4235. A general-noun dependency holds from 「クリスマス」 to 「休暇」, so processing proceeds to step 4236. At step 4236, the part of speech of W4 = 「頃」 ("around") is a suffix, so processing proceeds to step 4237, and 「クリスマス休暇」 is taken as a time-word pattern. Processing then proceeds to step 4238, which sets j = 3.

【００６４】Next, returning to step 424, 3 ≠ p (= 4), so processing proceeds to step 426. With j = 4, processing proceeds to step 422. W4 = 「頃」 is not a time word, so processing proceeds to step 425. No word pattern is generated at step 425, and processing proceeds to step 424. Since 4 = p (= 4), processing proceeds to step 427. At step 427, 「頃」 is absorbed into the time-word pattern 「クリスマス休暇」, giving 「クリスマス休暇頃」, and processing proceeds to step 428. At step 428, 「に」 is absorbed into 「クリスマス休暇頃」, giving 「クリスマス休暇頃に」.

【００６５】Next, processing proceeds to the accent phrase division by number of remaining words of step 430 in FIG. 4. Since there is no partial segment for which no word pattern was generated, nothing is done. This series of steps completes the processing of segment Sg(2).
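The trace for Sg(2) can be mirrored by a short Python sketch of the FIG. 5 loop (steps 421 to 428). It is a simplified illustration under stated assumptions: words are `(surface, part-of-speech)` pairs with hypothetical labels, the FIG. 6 decision is passed in as a callback `pattern_len` that returns 1 or 2, and the other pattern generators of step 425 are omitted.

```python
def split_by_word_patterns(words, p, pattern_len):
    """Return the surface strings of the time-word patterns in one partial segment.

    words: list of (surface, pos) pairs; p: number of independent words;
    pattern_len(words, j): 1 or 2, standing in for the FIG. 6 process.
    """
    patterns = []  # [start, end) index ranges, 0-based
    j = 0          # step 421 (the text counts from 1)
    while j < p:   # steps 424 and 426
        if words[j][1] == "time":      # step 422
            n = pattern_len(words, j)  # step 423
            patterns.append([j, j + n])
            j += n
        else:
            j += 1                     # step 425: other generators omitted
    for pat in patterns:               # step 427: absorb adjacent affixes
        if pat[0] > 0 and words[pat[0] - 1][1] == "prefix":
            pat[0] -= 1
        if pat[1] < len(words) and words[pat[1]][1] == "suffix":
            pat[1] += 1
    for pat in patterns:               # step 428: absorb attached words
        while pat[1] < len(words) and words[pat[1]][1] == "attached":
            pat[1] += 1
    return ["".join(s for s, _ in words[a:b]) for a, b in patterns]


sg2 = [("毎年", "time"), ("クリスマス", "time"), ("休暇", "general_noun"),
       ("頃", "suffix"), ("に", "attached")]
# Stub for FIG. 6: only クリスマス combines with the following word.
stub = lambda ws, j: 2 if ws[j][0] == "クリスマス" else 1
result = split_by_word_patterns(sg2, 4, stub)
```

Here `result` holds the two patterns 毎年 and クリスマス休暇頃に, matching the outcome of the walk-through.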

【００６６】The other segments can likewise be divided into accent phrases by performing the processing shown in the flowcharts of FIGS. 4, 5, and 6. The present invention is not limited to the above embodiment, and various modifications and applications are possible within the scope of the claims.

【００６７】[0067]

[Effects of the Invention] As described above, according to the synthetic speech output method and apparatus of the present invention, accent phrase division of time words, which was conventionally handled in a single uniform way, is performed through time-word pattern generation that uses information on whether the time word is used adnominally, information from dependency analysis within the compound noun, the parts of speech of the words making up the compound noun, and the number of words in the compound word not yet divided into accent phrases. This allows compound nouns containing time words to be divided into accent phrases with higher accuracy, enabling more natural synthetic speech output.

[Brief description of drawings]

FIG. 1 is a diagram for explaining the principle of the present invention.

FIG. 2 is a diagram of the principle configuration of the present invention.

FIG. 3 is a block diagram of a synthetic speech output apparatus according to an embodiment of the present invention.

FIG. 4 is a flowchart for explaining the processing of the accent phrase division unit according to an embodiment of the present invention.

FIG. 5 is a flowchart of the accent phrase division process by word pattern according to an embodiment of the present invention.

FIG. 6 is a flowchart of the word pattern generation process according to an embodiment of the present invention.

FIG. 7 is a diagram showing an example of synthetic speech processing according to an embodiment of the present invention.

FIG. 8 is a diagram for explaining the segment division process according to an embodiment of the present invention.

FIG. 9 is a diagram for explaining the compound-noun-phrase accent phrase division process according to an embodiment of the present invention.

[Explanation of symbols]

100: morphological analysis unit, word analysis means
110: word dictionary
200: reading and prosodic information setting unit
210: reading assignment unit, reading-kana assignment means
220: accent phrase division unit, accent phrase division means
230: accent type setting unit, accent type setting means
240: pause setting unit, pause setting means
300: synthetic speech generation unit, speech synthesis means

Continuation of front page
(56) References: JP-A-2-93499 (JP, A); JP-A-2-5097 (JP, A); JP-A-3-37700 (JP, A); JP-A-3-58097 (JP, A); JP-A-4-36799 (JP, A); JP-A-5-134692 (JP, A); JP-A-6-118981 (JP, A); JP-A-6-149282 (JP, A)
(58) Fields searched (Int.Cl.⁷, DB name): G10L 13/00 - 13/08; G06F 17/21 - 17/28

Claims

(57) [Claims]

1. A synthetic speech output method for reading aloud, in synthetic speech, a Japanese document written in mixed kanji and kana, the method comprising:
a word analysis process that recognizes each word in an input Japanese kanji-kana mixed sentence, performs morphological analysis on the recognized words, and obtains information on each word;
a reading-kana assignment process that assigns reading kana to the Japanese kanji-kana mixed sentence using each word analyzed in the word analysis process and the information on each word;
an accent phrase division process that divides the sentence into accent phrases using information on whether a time word is used adnominally in the Japanese kanji-kana mixed sentence, dependency information within a compound noun, the parts of speech of the words making up the compound noun, and the number of words in the compound word;
an accent type setting process that sets an accent type for each divided accent phrase;
a pause setting process that sets pause information between the divided accent phrases; and
a speech synthesis process that produces synthetic speech with prosodic information corresponding to the Japanese kanji-kana mixed sentence, using the reading kana assigned in the reading-kana assignment process, the accent type, and the pause information,
wherein the accent phrase division process performs:
a word pattern identification change process that, when a word in a given compound word is a time word, changes the word pattern to be generated according to the part-of-speech type of the word immediately preceding the time word; and
a trailing-end determination process that makes a determination on the word and the word immediately preceding it when the word preceding the time word has been judged by the word pattern identification change process to be a sa-hen verb-type noun or a general noun in a dependency relation, and either the word one after the time word is at the end of the compound word or the word two after it is a suffix, an attached word, or a symbol.

2. A synthetic speech output apparatus for reading aloud, in synthetic speech, a Japanese document written in mixed kanji and kana, the apparatus comprising:
word analysis means that recognizes each word in an input Japanese kanji-kana mixed sentence, performs morphological analysis on the recognized words, and obtains information on each word;
reading-kana assignment means that assigns reading kana to the Japanese kanji-kana mixed sentence using each word analyzed by the word analysis means and the information on each word;
accent phrase division means that divides the sentence into accent phrases using information on whether a time word is used adnominally in the Japanese kanji-kana mixed sentence, dependency information within a compound noun, the parts of speech of the words making up the compound noun, and the number of words in the compound word;
accent type setting means that sets an accent type for each divided accent phrase;
pause setting means that sets pause information between the divided accent phrases; and
speech synthesis means that produces synthetic speech with prosodic information corresponding to the Japanese kanji-kana mixed sentence, using the reading kana assigned by the reading-kana assignment means, the accent type, and the pause information,
wherein the accent phrase division means has:
word pattern identification change means that, when a word in a given compound word is a time word, changes the word pattern to be generated according to the part-of-speech type of the word immediately preceding the time word; and
trailing-end determination means that makes a determination on the word and the word immediately preceding it when the word preceding the time word has been judged by the word pattern identification change means to be a sa-hen verb-type noun or a general noun in a dependency relation, and either the word one after the time word is at the end of the compound word or the word two after it is a suffix, an attached word, or a symbol.