JPH0562356B2

Info

Publication number: JPH0562356B2
Application number: JP59071753A
Authority: JP
Inventors: Sadaichi Watanabe; Norimasa Nomura
Original assignee: Tokyo Shibaura Electric Co Ltd
Current assignee: Toshiba Corp
Priority date: 1984-04-12
Filing date: 1984-04-12
Publication date: 1993-09-08
Also published as: JPS60216394A

Description

[Detailed description of the invention]

[Technical Field of the Invention]

The present invention relates to a speech synthesis method that converts character strings into speech.

[Technical Background of the Invention and Its Problems]

A speech synthesis device that converts sentences into speech must analyze the character string to generate phonological information and prosodic information. Within the prosodic information, information about word accent carries considerable weight.

The difficulty of determining a word's accent depends on whether the word is a simple word or a compound word, and compound words are, needless to say, the harder case. Conventionally, the accent type of a compound word has been determined by rule from the accent types of its constituent words (each constituent word has its own inherent accent type). However, some compound words cannot be handled correctly by such rules. For those compound words, erroneous accent information is generated, which lowers the intelligibility of the synthesized speech and impairs its naturalness. This was the problem with the conventional method.

[Object of the Invention]

The object of the present invention is to provide a rule-based speech synthesis method for converting character strings into speech that determines the accent of compound words with high accuracy and thereby generates synthesized speech with higher intelligibility and naturalness than before.

[Summary of the Invention]

The present invention divides a Japanese sentence written in mixed kanji and kana into bunsetsu (phrase) units and matches the character string of each bunsetsu against a word dictionary. The matched character string is removed from the bunsetsu; if the remaining string begins with a kanji, compound-word processing is performed, and otherwise simple-word processing is performed. Each process generates phonological information and prosodic information, from which phonological parameters and prosodic parameters are created, and both sets of parameters are sent to a synthesizer to obtain synthesized speech. In the compound-word processing, when the string is matched as a compound word, the post-connection accent information of the rear constituent word is read from the word dictionary and used to determine the accent of the compound word. This makes highly accurate determination possible, so that compound words in the synthesized speech receive the correct accent and the intelligibility and naturalness of the synthesized speech improve.

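The segmentation and dispatch steps summarized above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the dictionary is a plain set of surface strings, the bunsetsu boundary is placed at every kana-to-kanji transition (the one example of a character-type change the description gives), and longest-prefix matching is assumed because the description does not say how the matched word is chosen.

```python
from __future__ import annotations


def is_kanji(ch: str) -> bool:
    """True for characters in the CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def is_kana(ch: str) -> bool:
    """True for hiragana and katakana (U+3040 to U+30FF)."""
    return "\u3040" <= ch <= "\u30ff"


def split_bunsetsu(sentence: str) -> list[str]:
    """Start a new bunsetsu wherever a kana is followed by a kanji (assumed rule)."""
    units: list[str] = []
    start = 0
    for i in range(1, len(sentence)):
        if is_kana(sentence[i - 1]) and is_kanji(sentence[i]):
            units.append(sentence[start:i])
            start = i
    if sentence:
        units.append(sentence[start:])
    return units


def match_prefix(text: str, dictionary: set[str]) -> str | None:
    """Return the longest dictionary word that is a prefix of text (assumed policy)."""
    for end in range(len(text), 0, -1):
        if text[:end] in dictionary:
            return text[:end]
    return None


def classify_bunsetsu(bunsetsu: str, dictionary: set[str]) -> tuple[str, str, str]:
    """Return (kind, matched_word, remainder) for one bunsetsu.

    kind is "compound" when the remainder begins with a kanji,
    "simple" when it does not, and "unmatched" when no word matched.
    """
    head = match_prefix(bunsetsu, dictionary)
    if head is None:
        return ("unmatched", "", bunsetsu)
    rest = bunsetsu[len(head):]
    if rest and is_kanji(rest[0]):
        return ("compound", head, rest)
    return ("simple", head, rest)
```

With a hypothetical dictionary containing 都市 and 建設, the bunsetsu 都市建設を is routed to compound-word processing because the remainder 建設を begins with a kanji, while 都市を is routed to simple-word processing.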
[Effects of the Invention]

According to the present invention, the accent of compound words, which are used frequently in Japanese sentences, can be determined with higher accuracy than before, and the naturalness and intelligibility of the synthesized speech for compound words improve as a result. Compound words often serve as keywords in a sentence, and in such cases the invention is particularly effective.

[Embodiments of the Invention]

One embodiment of the present invention is shown in FIG. 1.

The bunsetsu segmentation circuit 1 divides the mixed kanji-kana character string of a Japanese sentence into bunsetsu units according to changes in character type (for example, from kana to kanji). The word matching circuit 2 matches the bunsetsu character string against the word dictionary 3 and removes the matched word from the bunsetsu. When the remaining string begins with a kanji, the compound word matching circuit 4 is driven. When it does not begin with a kanji, the phonological information and accent type of the matched word are read from the word dictionary 3, and the phoneme sequence determination circuit 6 and the prosodic parameter generation circuit 9 are driven on that basis.

The compound word matching circuit 4 matches the remaining string against the word dictionary 3. When the match succeeds, it reads the phonological information of the word from the word dictionary 3 and drives the phoneme sequence determination circuit 6 on that basis; at the same time it reads the post-connection accent information from the word dictionary 3 and drives the compound word accent determination circuit 5 on that basis.

The compound word accent determination circuit 5 determines the accent of the compound word using the post-connection accent information of the rear constituent word. For example, for the compound word [都市建設] (toshi kensetsu, "urban construction"), when the front constituent word [都市] is matched by the word matching circuit 2 and the rear constituent word [建設] is matched by the compound word matching circuit 4, the post-connection accent information read out for the rear constituent word [建設] is [1]. This value [1] indicates that the accent nucleus of the compound word falls on the first mora of the rear constituent word, so the accent nucleus of the compound word [トシケンセツ] is determined to be [ケ], the first mora of the rear constituent word. That is, the accent of the compound word is [トシケンセツ] with the accent nucleus on the third mora, the so-called type 3.

By the same procedure, for the compound word [悪戦苦闘] (akusen kutō, "hard struggle"), the post-connection accent information of the rear constituent word [苦闘] is read out.

The value read out is [0]. This value [0] indicates that the compound word as a whole is of the flat type, with no accent nucleus on any mora, so the accent of this compound word is determined to be [アクセンクトー] with no accent nucleus, that is, type 0.

The conventional accent determination method has no post-connection accent information. It determines the accent of a compound word by rule from the number of moras in the rear constituent word and the inherent accent type that word has when used as a simple word. It therefore determines the accent of the example [悪戦苦闘] above to be [アクセンクトー] with the accent nucleus on the fifth mora, that is, type 5.

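The two worked examples, [都市建設] and [悪戦苦闘], can be reproduced with a short Python sketch. It is an illustration under assumptions rather than the circuit described in the embodiment: the function names are hypothetical, readings are passed in as katakana, and a post-connection value k of 1 or more is read as "nucleus on the k-th mora of the rear constituent word". Only the values 1 and 0 appear in the description, so larger values are an extrapolation.

```python
# Small kana that attach to the preceding kana and do not form a mora of their own.
SMALL_KANA = set("ァィゥェォャュョヮ")


def count_moras(katakana: str) -> int:
    """Count moras in a katakana reading; ン, ッ and ー each count as one mora."""
    return sum(1 for ch in katakana if ch not in SMALL_KANA)


def compound_accent_type(front_reading: str, post_connection: int) -> int:
    """Return the accent type of the whole compound (0 means flat).

    post_connection is the rear word's post-connection accent information:
    0 means no accent nucleus anywhere in the compound; k >= 1 is read here
    as "nucleus on the k-th mora of the rear word" (assumption for k > 1).
    """
    if post_connection < 0:
        raise ValueError("post-connection accent information must be non-negative")
    if post_connection == 0:
        return 0
    # The nucleus position is counted from the start of the compound,
    # so the moras of the front constituent word are added.
    return count_moras(front_reading) + post_connection


print(compound_accent_type("トシ", 1))      # 都市 + 建設
print(compound_accent_type("アクセン", 0))  # 悪戦 + 苦闘
```

The first call yields type 3, because the two moras of トシ precede the nucleus on ケ. The second yields type 0, whereas a rule that always placed the nucleus on the first mora of the rear word would give 4 + 1 = 5, the erroneous type 5 attributed to the conventional method.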
Under the conventional method, erroneous prosodic information is consequently sent to the prosodic parameter generation circuit. The present method, as described above, determines the accent correctly even for compound words that the conventional method gets wrong; that is, it determines compound-word accent with high accuracy.

Next, the phoneme sequence determination circuit 6 determines, from the phonological information of the word, whether each phoneme is nasalized or devoiced, and generates a phoneme sequence. The phonological parameter generation circuit 7 uses the speech segment file 8 to build a sequence of phonological parameters from that phoneme sequence and sends it to the speech synthesizer 10.

The prosodic parameter generation circuit 9 generates a sequence of prosodic parameters from the prosodic information of the simple words and compound words and sends it to the speech synthesizer 10. The speech synthesizer 10 is driven by the phonological parameter sequence and the prosodic parameter sequence and outputs synthesized speech.

With the above method, synthesized speech of high naturalness and intelligibility can be obtained for Japanese sentences containing compound words.

The present invention is not limited to synthesizing speech from mixed kanji-kana text. With some modification, it can also be applied to synthesizing speech from kana character strings entered bunsetsu by bunsetsu using kana keys.

[Brief explanation of the drawing]

FIG. 1 is a schematic configuration diagram showing one embodiment of the present invention. FIG. 2 is a diagram showing the format of the word dictionary.

1: bunsetsu segmentation circuit; 2: word matching circuit; 3: word dictionary storage device; 4: compound word matching circuit; 5: compound word accent determination circuit; 6: phoneme sequence determination circuit; 7: phonological parameter generation circuit; 8: speech segment file storage device; 9: prosodic parameter generation circuit; 10: speech synthesizer.
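FIG. 2 is not reproduced in this text, but the description names three items that are read from the word dictionary 3: the phonological information of a word, its accent type, and its post-connection accent information. A minimal Python sketch of such a record might look as follows. The class, field, and function names are hypothetical, the phonological information is simplified to a katakana reading, and the inherent accent type is left unset because the description gives no values for it.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DictionaryEntry:
    reading: str                 # phonological information, simplified to a katakana reading
    accent_type: Optional[int]   # inherent accent type; None where no value is given
    post_connection_accent: int  # post-connection accent information (0 means flat compound)


# Only the two rear constituent words whose values appear in the description.
WORD_DICTIONARY = {
    "建設": DictionaryEntry("ケンセツ", None, 1),
    "苦闘": DictionaryEntry("クトー", None, 0),
}


def lookup(word: str) -> Optional[DictionaryEntry]:
    """Return the entry for word, or None when the word is not in the dictionary."""
    return WORD_DICTIONARY.get(word)
```

Keeping the post-connection accent information in the same record as the reading matches the description, in which the compound word matching circuit 4 reads both from the dictionary at the same time.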

Claims

[Claims]

1. A bunsetsu segmentation circuit that divides a character string of a Japanese sentence written in mixed kanji and kana into bunsetsu units; a word matching circuit that matches the character string of each bunsetsu against a word dictionary and extracts phonological information and prosodic information; a compound word matching circuit that, when the string remaining after the matched character string is removed from the bunsetsu begins with a kanji, matches the remaining string against the word dictionary again; a compound word accent determination circuit that determines the accent of the string matched as a compound word, using post-connection accent information extracted from the word dictionary during the compound word matching; a phoneme sequence determination circuit that determines a phoneme sequence based on the matching result of the word matching circuit or the compound word matching circuit; a phonological parameter generation circuit that converts the phoneme sequence into phonological parameters; a prosodic parameter generation circuit that generates prosodic parameters based on the result of the word matching circuit or the compound word accent determination circuit; and a speech synthesizer that generates synthesized speech based on the phonological parameters and the prosodic parameters.