JPH09204434A - Device and method for synthesizing voice and recording medium

Info

Publication number: JPH09204434A
Application number: JP8010399A
Authority: JP
Inventors: Yoshiaki Teramoto; 良明寺本; Nobuyuki Katae; 伸之片江; Hidetoshi Tsujiuchi; 秀敏辻内; Akihiro Kimura; 晋太木村
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1996-01-24
Filing date: 1996-01-24
Publication date: 1997-08-05
Anticipated expiration: 2016-01-24
Also published as: JP3983313B2

Abstract

PROBLEM TO BE SOLVED: To read aloud correctly text information, such as addresses, in which a single notation has several possible readings, by providing a hierarchically structured dictionary that organizes the notation and reading information of such words into levels. SOLUTION: When text is input through a text input unit 1, a hierarchical dictionary search unit 2 retrieves reading candidates for the words contained in the input text from a hierarchical dictionary 3. The dictionary stores the notation and reading of each word in word strings, such as addresses, made up of word groups in which the reading of a word is determined by the word that precedes it, together with hierarchical-structure information that arranges the words in levels according to their connection order. Based on that hierarchical-structure information, a sentence analysis unit 4 selects, from the candidates, the reading of a word string in the text that matches a word string in the hierarchical dictionary 3, and converts the word string into the dictionary's reading. A speech waveform generation unit 6 generates a speech waveform from the reading information. Word strings such as addresses are thus read aloud with the correct reading.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a speech synthesizer and a speech synthesis method that read aloud, with the correct reading, text information such as addresses in which a single notation has a plurality of readings, and to a recording medium on which a computer program for the method is recorded.

[0002]

[Prior Art] When text information is input to a conventional speech synthesizer, the synthesizer refers to a dictionary that stores, in association with the notation of each word, the reading of the word and information such as accent needed to synthesize speech. It converts the text information into readings, generates a speech waveform from the readings and the accent information, and reads the text aloud as synthesized speech.

[0003]

[Problem to Be Solved by the Invention] However, when the text information contains an address or similar item in which a single notation has a plurality of readings, the synthesizer may be unable to decide which reading to use and may read the text aloud incorrectly. For example, the notation "本町" can be read "ホ′ンチョー" (Honchō), "ホ′ンマチ" (Honmachi), or "モト′マチ" (Motomachi), where "′" is the accent mark.

[0004] The present invention was made to solve this problem. Its object is to provide a speech synthesizer, a speech synthesis method, and a recording medium recording a computer program of the method that read aloud, with the correct reading, text information such as addresses in which a single notation has a plurality of readings. This is achieved by providing a hierarchically structured dictionary in which the notation and reading information of such addresses and similar items is organized into levels.

[0005]

[Means for Solving the Problem] FIG. 1 is a basic block diagram of the speech synthesizer of the present invention. A text input unit 1 inputs text information either directly from a recording medium such as a CD-ROM or magneto-optical disk, or via a public line or the like. When text information is input, a hierarchical dictionary search unit 2 searches a hierarchical dictionary 3 for reading candidates of the words contained in the input text. The hierarchical dictionary 3 stores the notation and reading of each word of a word string, such as an address, made up of a word group in which the reading of a word is determined by the word that appears before it. This information is stored together with hierarchical-structure information in which the words are arranged in levels according to their connection order; for an address, the order is prefecture, then city or county, then town, village, or ward. A sentence analysis unit 4 selects, from the reading candidates and on the basis of the hierarchical-structure information, the reading of a word string in the text that matches a word string in the hierarchical dictionary 3, and converts that word string into the reading given in the hierarchical dictionary 3. The synthesizer further comprises a speech waveform generation unit 6 that generates a speech waveform from the reading information and a speaker 7 that outputs the speech waveform. As a result, a word string such as an address in which a single notation has a plurality of readings is read aloud with the correct reading.

[0006] In the speech synthesizer of the present invention, a region can also be designated, either by inputting information that identifies the region or by determining it from a place name contained in the text information. The search then starts from the level belonging to the designated region, which shortens the search time.

[0007] In the speech synthesizer of the present invention, the hierarchical-structure information held in the hierarchical dictionary may be information that identifies, for each word, the parent word in the level above that determines the word's reading. The synthesizer refers to this information for the reading candidates retrieved from the hierarchical dictionary and sets the connection relations among the reading candidates of the word string contained in the text information.

[0008] The speech synthesizer of the present invention also searches the hierarchical dictionary on the basis of a word string in which the notation of a word at some level has been omitted. Even when part of a word string in the document is omitted, the word string is therefore read aloud with the correct reading.

[0009] The speech synthesizer of the present invention also adopts the reading in the hierarchical dictionary as the reading of characters or words only when the notation of at least a predetermined number of characters or words matches the notation of the characters or words contained in a word string stored in the hierarchical dictionary. Consequently, an ordinary word that is not part of a multi-reading word string such as an address, but whose notation happens to coincide with part of such a word string, is not mistakenly read aloud with the address reading.

[0010] The speech synthesizer of the present invention is also provided with a suffix dictionary that stores the notation and reading of suffixes that attach to word strings. The synthesizer searches the suffix dictionary for a notation that matches the text immediately following a word string that matched a word string in the hierarchical dictionary, and selects the reading of the matching suffix dictionary entry as the reading of that following text. A suffix whose reading differs from that of the ordinary word when it is attached to such a word string is thereby read aloud with the correct reading.

[0011]

[Embodiments of the Invention] FIG. 2 is a schematic diagram showing an example of the speech synthesizer of the present invention. In this example, a recording medium D such as a magneto-optical disk or CD-ROM is loaded into the disk drive of a general-purpose personal computer that has an audio output function. The recording medium D holds the dictionaries described below, such as an address dictionary and an address suffix dictionary, and a computer program of the speech synthesis method. The computer program is loaded, speech is synthesized from text information input from a text database disk or via a public line (not shown), and the synthesized speech is output from a speaker connected to the computer or via the public line.

[0012] The speech synthesizer of the present invention is not limited to the configuration described above, in which software is loaded onto a general-purpose personal computer. It may also be a dedicated speech synthesis machine that reads aloud, as synthesized speech, character string information such as traffic information received from FM multiplex broadcasting. In that case the text information is input through an antenna.

[0013] [Embodiment 1] FIG. 3 is a block diagram of Embodiment 1 of the speech synthesizer of the present invention. A text input unit 101 inputs text information from a CD-ROM, magneto-optical disk, or the like. An address dictionary search unit 102 searches a hierarchical address dictionary 103 for word strings that match the address word strings contained in the text information. In the address dictionary 103, the notation and reading information of address word strings is organized into levels and stored together with the hierarchical-structure information. A sentence analysis unit 104 adopts the reading in the address dictionary 103 for each word found there, looks up the remaining ordinary words in a basic dictionary 105, and converts the text information into readings. A speech waveform generation unit 106 generates a speech waveform from the readings, accent information, and the like, and a speaker 107 outputs synthesized speech that reads the text aloud.

[0014] FIG. 4 is a conceptual diagram of an example of the address dictionary 103. The address dictionary 103 stores, in hierarchical form, the notation and reading information of each word that makes up an address word string. The top level holds the names of the 47 prefectures. Below it, the place names are arranged in levels in their connection order: cities, wards, towns, or counties, followed by town or village names, and then the names of subdivisions such as ōaza (大字) and koaza (小字). In addition to this hierarchy information, each word has its kanji notation, which serves as the search key, and pronunciation information, that is, reading information, which includes prosodic information such as accent phrase boundaries and accent type.
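The layered dictionary of FIG. 4 can be pictured as a tree keyed by kanji notation. The following is a minimal sketch under assumptions, not the patent's storage format: the `Node` class, its field names, and the `readings` helper are hypothetical, and the sample entries are illustrative toy data (accent marks are shown only where the source text gives them).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One word of an address: kanji notation plus its reading at this position."""
    surface: str                      # kanji notation, used as the search key
    reading: str                      # katakana reading (accent mark where known)
    children: dict = field(default_factory=dict)

    def add(self, surface, reading):
        """Return the child with this notation, creating it if needed."""
        child = self.children.get(surface)
        if child is None:
            child = Node(surface, reading)
            self.children[surface] = child
        return child


def readings(root, path):
    """Walk the tree along a list of notations and collect the readings."""
    node, out = root, []
    for surface in path:
        node = node.children[surface]
        out.append(node.reading)
    return out


# Toy data: the same notation "本町" sits under two parents with different readings.
root = Node("", "")
tokyo = root.add("東京都", "トーキョ′ート")
tokyo.add("中野区", "ナカノク").add("本町", "ホ′ンチョー")
tokyo.add("渋谷区", "シブヤク").add("本町", "ホ′ンマチ")

print(readings(root, ["東京都", "渋谷区", "本町"]))
```

Because each reading is stored on the node reached through a particular parent, the ambiguity of "本町" disappears once the preceding words are known.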

[0015] FIG. 5 is a flowchart of the basic search algorithm for the address dictionary 103 in Embodiment 1. First, a text pointer is set to the head of the text buffer that holds the input text information (S101), and an address dictionary search pointer is set to the top of the hierarchy of the address dictionary 103 (S102). The text pointer is then advanced one character at a time, and the word beginning at the text pointer position is compared with the candidate words in the address dictionary 103 to determine whether they match (S103, S104). In this way the device checks whether a word from the address dictionary 103 is present in the text buffer. When the address dictionary 103 contains a word that begins at the text pointer, the address dictionary search pointer is moved to the next level (S105), the text pointer is moved to the next word position (S106), and the device checks whether a word of the next level is present at that position in the text (S107, S108).

[0016] When no further word matching the address dictionary 103 is found, the span that matched words in the address dictionary 103 is treated as an address span. The sequence of readings set for the matched word string is set as the pronunciation information for that address span (S109) and passed to the sentence analysis unit 104. Because the text after the address span may also contain addresses, the address dictionary search pointer is reset to the top of the hierarchy (S110), and the same processing is applied to the whole text until the end of the text information is reached (S111, S112).
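The scan described for FIG. 5 (S101 to S112) can be sketched as follows. This is an illustration under assumptions rather than the patented implementation: the nested-dict layout of `ADDRESS_DICT`, the function name `find_address_spans`, and the choice to take the longest matching notation at each level are all hypothetical, and the dictionary holds toy data.

```python
# Toy dictionary: notation -> (reading, next level). Illustrative data only.
ADDRESS_DICT = {
    "東京都": ("トーキョ′ート", {
        "大田区": ("オータ′ク", {}),
        "港区": ("ミナトク", {"白金台": ("シロカネダイ", {})}),
    }),
}


def find_address_spans(text, tree=ADDRESS_DICT):
    """Return (start, end, readings) for each address span found in text."""
    spans = []
    pos = 0                                    # text pointer at buffer head (S101)
    while pos < len(text):                     # until end of text (S111, S112)
        level = tree                           # dictionary pointer at top (S102, S110)
        cur, found = pos, []
        while True:
            # Does a word of the current level start here? (S103/S104, S107/S108)
            match = max((s for s in level if text.startswith(s, cur)),
                        key=len, default=None)
            if match is None:
                break
            reading, level = level[match]      # descend one level (S105)
            found.append(reading)
            cur += len(match)                  # move to next word position (S106)
        if found:
            spans.append((pos, cur, found))    # readings for the address span (S109)
            pos = cur
        else:
            pos += 1                           # shift the text pointer by one character
    return spans


print(find_address_spans("東京都大田区に住んでいます。"))
```

Text outside the returned spans would be handed to the ordinary dictionary lookup, as the description of the sentence analysis unit 104 suggests.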

[0017] [Embodiment 2] FIG. 6 is a block diagram of Embodiment 2. Parts identical to those of Embodiment 1 are given the same reference numerals and their description is omitted. Embodiment 2 addresses the case in which a word is not present in the basic dictionary 105 used by the sentence analysis unit 104 but its pronunciation information is known from the search of the address dictionary 103. Instead of registering the word, the device is provided with an address pronunciation setting unit 108, which embeds the pronunciation information into the input text as a pronunciation designation string, and a pronunciation designation analysis unit 109, which identifies the pronunciation designation string embedded in the text and parses it as a pronunciation designation.

[0018] Suppose the format of the pronunciation designation string is defined as "〈発音：漢字表記：発音情報〉" (〈pronunciation: kanji notation: pronunciation information〉). The symbols "〈", "：", and "〉" are given a special meaning within the text, and the string "発音" (pronunciation) is a keyword that identifies a pronunciation designation. The "kanji notation" field serves only as a comment, and the "pronunciation information" field holds the pronunciation expressed in katakana and accent symbols.

[0019] The operation is as follows. When a sentence such as "東京都大田区に住んでいます。" ("I live in Ōta-ku, Tokyo.") is input from the text input unit 101, the address dictionary search unit 102 searches the address dictionary 103 as in Embodiment 1. It determines that the span "東京都大田区" is an address word string and that the pronunciation of the address is "トーキョ′ートオータ′ク" (Tōkyō-to Ōta-ku; "′" is the accent symbol and "" is the accent phrase boundary symbol). The address pronunciation setting unit 108 then replaces the address span of the input sentence with the pronunciation designation string described above and outputs the string "〈発音：東京都大田区：トーキョ′ートオータ′ク〉に住んでいます。" to the pronunciation designation analysis unit 109.

[0020] The pronunciation designation analysis unit 109 identifies the characters delimited by the bracket symbols (〈〉) and parses them as a pronunciation designation. It treats that part as a noun whose pronunciation is "トーキョ′ートオータ′ク" and passes the rest of the sentence unchanged to the sentence analysis unit 104. The sentence analysis unit 104 analyzes the pronunciation designation information together with the rest of the sentence and sets the correct reading information.
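The embedding and parsing of the pronunciation designation string "〈発音：漢字表記：発音情報〉" can be sketched as below. The function names `embed` and `parse` and the tuple layout of the parse result are assumptions made for illustration; the sketch also assumes that neither the kanji notation nor the pronunciation contains the delimiter characters.

```python
import re

OPEN, SEP, CLOSE, KEYWORD = "〈", "：", "〉", "発音"

# 〈発音：<kanji notation>：<pronunciation>〉
_TAG = re.compile(f"{OPEN}{KEYWORD}{SEP}([^{SEP}{CLOSE}]*){SEP}([^{CLOSE}]*){CLOSE}")


def embed(text, start, end, pronunciation):
    """Replace text[start:end] (an address span) with a pronunciation designation."""
    surface = text[start:end]
    tag = f"{OPEN}{KEYWORD}{SEP}{surface}{SEP}{pronunciation}{CLOSE}"
    return text[:start] + tag + text[end:]


def parse(text):
    """Split text into ("pron", pronunciation) and ("text", plain text) parts."""
    parts, last = [], 0
    for m in _TAG.finditer(text):
        if m.start() > last:
            parts.append(("text", text[last:m.start()]))
        parts.append(("pron", m.group(2)))     # the kanji field is only a comment
        last = m.end()
    if last < len(text):
        parts.append(("text", text[last:]))
    return parts


tagged = embed("東京都大田区に住んでいます。", 0, 6, "トーキョ′ートオータ′ク")
print(tagged)
print(parse(tagged))
```

Keeping the designation inside the text stream means the downstream analyzer needs no new dictionary entry; it only has to recognize the bracketed form.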

[0021] In Embodiment 2, the address pronunciation setting unit 108 is connected to the address dictionary search unit 102, and the pronunciation designation analysis unit 109 is connected between the address pronunciation setting unit 108 and the sentence analysis unit 104. In addition to the input path from the address pronunciation setting unit 108, the pronunciation designation analysis unit 109 has an input path for text information from the text input unit 101. When a text that contains no address is to be read aloud, the text information can therefore be input directly to the sentence analysis unit 104 and converted into pronunciation information by referring to the basic dictionary 105. In other words, the address read-out section enclosed by the broken line in the figure can be built as an independent device, and the input path for the text information can be chosen selectively.

[0022] This has two consequences. First, when the address read-out section is built as an independent device, parallel processing becomes possible. The address dictionary 103 generally holds more than 100,000 words, and the basic dictionary 105 of the language processing section may hold from tens of thousands to more than 100,000 words, so the search load is heavy. With this configuration, two CPUs can process the address read-out part and the language processing for speech synthesis of the remaining parts in parallel, which prevents an increase in processing time. Second, when the address read-out section is implemented in software, the address reading section and the language processing section can be created as independent commands, which makes software maintenance and system changes easier.

[0023] [Embodiment 3] FIG. 7(a) is a block diagram of Embodiment 3. Parts identical to those of the embodiments described above are given the same reference numerals and their description is omitted. This embodiment is provided with a designated region input unit 110, which sets a region name each time text information is input, and a search start position storage buffer 111, which holds the designated region as the search start point in the hierarchical address dictionary 103. When a region name has been designated, the search starts from the levels of the hierarchy that belong to the designated region, which greatly reduces the processing time needed for the search.

[0024] An address is sometimes written in full, beginning with one of the 47 prefecture names. The prefecture name is often omitted, however, when it is obvious from the context or when the place name is well known enough not to need it. When the text contains an address that begins partway down the hierarchy, the hierarchical address dictionary 103 can be searched from an intermediate level, but the number of words to be searched grows as the level descends and may reach several hundred thousand or more. In this embodiment, the region is therefore used to designate the point in the hierarchy from which the search starts.

[0025] FIG. 8 is a conceptual diagram of the search start position storage buffer 111 when, for example, the region "北海道旭川市" (Asahikawa, Hokkaido) is designated through the designated region input unit 110. In that case, the address dictionary search unit 102 looks up the designated region, and the search pointers in the hierarchy for three positions, "top of hierarchy", "北海道", and "旭川市", are stored in the search start position storage buffer 111. The address dictionary search unit 102 searches the address dictionary 103 starting from each search pointer in the search start position storage buffer 111. The search start position information is not limited to a single region; information on several regions may be stored.
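One way to picture the search start position storage buffer 111 is as a list of levels from which matching may begin. The sketch below is an assumption-laden illustration: the names `start_levels` and `lookup`, the nested-dict layout, and the interpretation that each stored pointer refers to the level beneath the named region are all hypothetical, and the entries are toy data without accent marks.

```python
# Toy dictionary: notation -> (reading, next level). Illustrative data only.
TREE = {
    "北海道": ("ホッカイドー", {
        "旭川市": ("アサヒカワシ", {
            "永山": ("ナガヤマ", {}),
        }),
    }),
}


def start_levels(tree, region_path):
    """Collect the top level plus the level under each word of the region."""
    levels = [tree]                       # "top of hierarchy"
    level = tree
    for surface in region_path:
        _, level = level[surface]         # descend into the designated region
        levels.append(level)
    return levels


def lookup(text, pos, levels):
    """Try every stored start level; return (notation, reading) or None."""
    for level in levels:
        for surface, (reading, _) in level.items():
            if text.startswith(surface, pos):
                return surface, reading
    return None


buffer_111 = start_levels(TREE, ["北海道", "旭川市"])
print(lookup("永山に行く", 0, buffer_111))      # address written without prefecture or city
print(lookup("旭川市永山", 0, buffer_111))      # address written from the city level
```

Restricting the candidates to a few levels avoids scanning every town-level entry in the country for an address that omits its upper levels.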

[0026] FIG. 7(b) is a block diagram of a modification of Embodiment 3. The modification differs from Embodiment 3 in that the designated region input unit 110 is replaced by a designated region acquisition unit 112, which obtains the region information from the input text information. When the address dictionary search unit 102 searches the address dictionary 103 and finds a word string candidate in the address dictionary 103 that matches words in the text, the designated region acquisition unit 112 obtains the region of the place name in the text and stores it in the search start position storage buffer 111. The search start position information is then used as described above.

[0027] [Embodiment 4] FIG. 9 is a block diagram of Embodiment 4 of the speech synthesizer of the present invention. Parts identical to those of the embodiments described above are given the same reference numerals and their description is omitted. If place names that begin at an intermediate level are found by searching every level of the address dictionary from that level downward, the search takes a long time. To remove this drawback, Embodiment 4 structures the hierarchical address dictionary 103 as shown in FIG. 10 and provides a connection relation setting unit 113 for determining the connection relations between words.

[0028] In the address dictionary 103 of FIG. 10, every word in the hierarchy is expressed as a parent-child relation: parent information, such as a parent number identifying the parent word in the level above under which the word takes a given reading, is attached to the notation and reading information (for example, parent word #1, parent word #2, ...). To simplify searching by notation, the entries are also sorted in the code order of their notation. A word that has different readings for one notation is given parent information for each of its readings. Words that have the same reading in different notations can be given a plurality of pieces of parent information for each reading, which also compresses the amount of information.

[0029] FIG. 11 is a flowchart of the algorithm of the connection relation setting unit 113; the steps enclosed by the broken line are the processing performed in the connection relation setting unit 113. The search character position is set to the first character of the text information (S201), and words beginning at that start position are searched for (S202). It is determined whether a word beginning at the start position exists (S203). If none exists, the search character position is moved to the next character (S204), and it is determined whether this is the last character of the text (S205). If it is not the last character, the process returns to step S202 and words beginning at the start position are searched for again. In this way the address dictionary search unit 102 searches all levels of the address dictionary 103 on the basis of all the words contained in the text and extracts every candidate that matches a word in the text.

[0030] If the determination in step S203 finds that a word beginning at the start position exists, a word that has different readings or different associated words is split into separate candidates (S206). The connection relation setting unit 113 determines whether all words have been processed (S207). If they have not, it determines whether a word ending at the start position exists (S208); if none exists, it sets "no parent word" information (S209). If the determination in step S208 finds that a word ending at the start position exists, it determines whether there is a word whose parent number matches (S210). If no word has a matching parent number, "no parent word" information is set (S209); if a word with a matching parent number exists, a pointer to the parent word is set (S211).

[0031] If the determination in step S207 finds that all words have been processed, the process moves to step S204, the search character position is set to the next character, and steps S204 to S211 are repeated until the last character of the text is reached.

[0032] In this embodiment, whenever a parent word exists, the reading held by that word is always selected. Structuring the address dictionary 103 in this way yields the correct reading no matter which level of the hierarchy an address begins at. Alternatively, the address dictionary may assign a hash index to the words representing each level of an address and retrieve readings through the hash index. Even with that structure, the procedure for determining the connection relations after the candidates matching the notation have been extracted from the address dictionary is the same as for the configuration of FIG. 10.
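The parent-number dictionary of FIG. 10 and the linking step of FIG. 11 can be sketched as a flat list of entries plus a pass that connects each candidate to a candidate ending where it starts. This is a simplified illustration, not the flowchart itself: the `Entry` layout, the function names, and the linear scan (the text describes entries sorted by notation code, or a hash index) are assumptions, and the data is a toy sample without accent marks.

```python
from collections import namedtuple

# id, kanji notation, reading, ids of parent words under which this reading applies
Entry = namedtuple("Entry", "id surface reading parents")

ENTRIES = [
    Entry(1, "東京都", "トーキョート", ()),
    Entry(2, "中野区", "ナカノク", (1,)),
    Entry(3, "渋谷区", "シブヤク", (1,)),
    Entry(4, "本町", "ホンチョー", (2,)),
    Entry(5, "本町", "ホンマチ", (3,)),
]


def candidates(text):
    """Every (start, end, entry) where a dictionary notation occurs in text."""
    return [(start, start + len(e.surface), e)
            for start in range(len(text))
            for e in ENTRIES if text.startswith(e.surface, start)]


def link_parents(cands):
    """For each candidate, the index of an adjacent preceding parent, or None."""
    links = {}
    for i, (start, _, entry) in enumerate(cands):
        links[i] = next((j for j, (_, end, other) in enumerate(cands)
                         if end == start and other.id in entry.parents), None)
    return links


def resolve(text):
    """Readings per matched span, preferring the candidate that has a parent link."""
    cands = candidates(text)
    links = link_parents(cands)
    chosen = {}
    for i, (start, end, entry) in enumerate(cands):
        if links[i] is not None or (start, end) not in chosen:
            chosen[(start, end)] = entry.reading
    return [chosen[span] for span in sorted(chosen)]


print(resolve("渋谷区本町"))   # address that starts below the prefecture level
print(resolve("中野区本町"))
```

Because the parent check looks only at the word immediately before, no walk from the top of the hierarchy is needed, which is why an address beginning at any level can be resolved.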

[0033] When word strings matching a word string in the text are retrieved from the address dictionary 103, more than one match may occur. For example, suppose the address dictionary 103 contains the place names 《東京都−港区−白金》 (Tokyo, Minato-ku, Shirokane) and 《東京都−港区−白金台》 (Tokyo, Minato-ku, Shirokanedai) and the sentence 『東京都港区白金台は、…。』 is to be synthesized. Or suppose it contains 《山形県−南陽市−宮内−新町》 (Yamagata Prefecture, Nan'yō City, Miyauchi, Shinmachi) and 《熊本県−荒尾市−宮内−新町》 (Kumamoto Prefecture, Arao City, Miyauchi, Shinmachi) and the sentence 『荒尾市宮内新町は、…。』 is to be synthesized. In each case both place names in the address dictionary 103 match. In the first example, 《東京都−港区−白金》 has a total length of 7 characters and 《東京都−港区−白金台》 has a total length of 8 characters, so the reading of the longer one is selected. In the second example, the Yamagata place name matches only the 4 characters of 《宮内−新町》, whereas the Kumamoto place name matches the 7 characters of 《荒尾市−宮内−新町》, so the reading of the longer match, the Kumamoto one, is selected.

【００３４】以上のアルゴリズムのフローチャートを図
12に示す。全単語から単語候補列を作成し（Ｓ301 ）、
一番長い文字数の単語列を選択する（Ｓ302 ）。選択し
た単語と、この単語と区間が重複する候補列を、単語候
補列を格納しているバッファ（図示せず）から削除する
（Ｓ303 ）。単語候補列のバッファが空か否かを判定し
（Ｓ304 ）、バッファが空になるまでステップＳ302 、
Ｓ303 を繰り返す。このアルゴリズムでは文章中に複数
個の住所が含まれている場合も考慮した処理を行う。FIG. 12 shows a flowchart of the above algorithm. Word candidate strings are created from all the words (S301), and the word string with the largest number of characters is selected (S302). The selected word string and any candidate strings whose span overlaps it are deleted from the buffer (not shown) that stores the word candidate strings (S303). It is then determined whether the buffer of word candidate strings is empty (S304), and steps S302 and S303 are repeated until it is. This algorithm also handles the case where a sentence contains more than one address.
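The selection loop of FIG. 12 can be sketched as follows. This is only an illustrative Python sketch, not the patent's implementation: the `Candidate` record, its fields, and the function name are assumptions introduced here, and spans are taken to be half-open character ranges in the input text.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    start: int    # index of the first matched character in the text
    end: int      # index one past the last matched character
    reading: str  # reading taken from the address dictionary (placeholder)


def select_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Greedy longest-match selection following steps S301 to S304."""
    buffer = list(candidates)                                  # S301
    chosen = []
    while buffer:                                              # S304
        best = max(buffer, key=lambda c: c.end - c.start)      # S302
        chosen.append(best)
        # S303: drop the selected candidate and every candidate whose
        # span overlaps it; non-overlapping candidates stay in the buffer.
        buffer = [c for c in buffer
                  if c.end <= best.start or c.start >= best.end]
    return sorted(chosen)  # back in text order
```

For 『荒尾市宮内新町』, a 4-character Yamagata match at positions 3 to 7 and a 7-character Kumamoto match at positions 0 to 7 overlap, so only the Kumamoto candidate survives. A second address elsewhere in the sentence does not overlap and is selected on a later pass of the loop.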

【００３５】〔実施の形態５〕図13(a) 及び図13(b) は
本発明の音声合成装置の実施の形態５の構成図であっ
て、図13(a) は階層構造をそのまま持つ図４の構成の住
所辞書103 を使用する場合の構成図、図13(b) は階層情
報を単語間の親子関係で表現した図10の構成の住所辞書
103 を使用する場合の構成図である。なお、図中、上述
の実施の形態と同一部分には同一符号を付してその説明
を省略する。この実施の形態では、階層省略情報設定部
114 を設けた点が異なる。[Fifth Embodiment] FIGS. 13(a) and 13(b) are configuration diagrams of the fifth embodiment of the speech synthesizer of the present invention. FIG. 13(a) shows the configuration when using the address dictionary 103 of FIG. 4, which holds the hierarchical structure as it is, and FIG. 13(b) shows the configuration when using the address dictionary 103 of FIG. 10, in which the hierarchical information is expressed as parent-child relationships between words. In the figures, parts identical to those of the embodiments described above are given the same reference numerals and their description is omitted. This embodiment differs in that a layer omission information setting unit 114 is provided.

【００３６】即ち、地名を表す場合に省略される部分と
して、都道府県名から始まる先頭部分だけではなく、住
所の途中の階層が省略される場合もある。例えば、正確
には『山梨県西八代郡上九一色村』であるが、『山梨県
上九一色村』と郡の名称が省略されている場合、また正
確には『神奈川県横浜市緑区長津田』であるが、『神奈
川県横浜市長津田』と区の名称が省略されているような
場合がある。このような住所辞書の階層構造の一階層又
は何階層かが省略された住所表記が文章中に存在する場
合には階層構造を持った住所辞書をそのまま参照して検
索しても一致する候補が探し出せず、住所を正しく読み
上げることはできない。That is, the part omitted when a place name is written is not always the leading part starting from the prefecture name; a layer in the middle of the address may also be omitted. For example, the full form 『山梨県西八代郡上九一色村』 (Kamikuishiki-mura, Nishiyatsushiro-gun, Yamanashi Prefecture) may be written 『山梨県上九一色村』 with the county name omitted, and the full form 『神奈川県横浜市緑区長津田』 (Nagatsuta, Midori-ku, Yokohama-shi, Kanagawa Prefecture) may be written 『神奈川県横浜市長津田』 with the ward name omitted. When the text contains an address written with one or more layers of the address dictionary's hierarchy omitted, searching the hierarchical address dictionary as it stands finds no matching candidate, and the address cannot be read out correctly.

【００３７】従って、実施の形態５では、単語検索時又
は接続関係設定時に一階層飛ばした組み合わせも可能で
あるという規則を設ける。図13(a) に関しては、階層毎
にポインタをずらしながら単語を検索していく方法を基
に説明する。この場合、特定の階層又は全ての階層での
単語を検索する際に、検索ポインタで示される階層の単
語を検索するだけでなく、それらの単語及びそれらの単
語の一階層又は何階層か下の全ての単語を検索する。階
層省略情報設定部114 は、このような検索の階層省略の
情報を設定する手段である。Accordingly, the fifth embodiment introduces a rule that, at word search time or when connection relations are set, a combination that skips one layer is also allowed. FIG. 13(a) is explained on the basis of the method that searches for words while moving a pointer layer by layer. In this case, when searching for words in a specific layer or in all layers, the search covers not only the words in the layer indicated by the search pointer but also all words one or more layers below them. The layer omission information setting unit 114 is the means that sets this layer-omission information for the search.

【００３８】図13(b) に関しては、文章中に表れる全階
層中の全単語を検索し、その後に、辞書引きされた単語
間の接続関係を接続関係設定部113 で求める方法を基に
説明する。この場合、一階層又は何階層か省略された場
合でも単語間の親子関係が存在するという関係を接続関
係設定部113 で設定する必要があるが、そのための制御
を階層省略情報設定部114 で行う。即ち、親情報のポイ
ンタを二回たどることで、一階層省略された場合でも接
続関係があるということを簡単に判定することができ
る。FIG. 13(b) is explained on the basis of the method that first looks up all words of all layers that appear in the text and then has the connection relation setting unit 113 determine the connection relations between the words found in the dictionary. In this case, the connection relation setting unit 113 must treat a parent-child relationship between words as holding even when one or more layers are omitted, and the layer omission information setting unit 114 performs the control for this. Specifically, following the parent-information pointer twice makes it easy to determine that a connection relation exists even when one layer has been omitted.

【００３９】また、上記問題点を解決する他の方法とし
て、単語検索時又は接続関係設定時に、郡の名称，区の
名称等の省略される可能性のある単語に関して、省略さ
れる可能性があるという情報を、住所辞書103 の単語の
属性として持つ方法が考えられる。図14(a) 、図14(b)
は実施の形態５の変形例の構成図であって、図14(a) は
階層構造をそのまま持つ図４の構成の住所辞書103 を使
用する場合の構成図、図14(b) は階層情報を単語間の親
子関係で表現した図10の構成の住所辞書103 を使用する
場合の構成図である。As another way of solving the above problem, words that may be omitted, such as county names and ward names, can carry an attribute in the address dictionary 103 indicating that they may be omitted, and this attribute is consulted at word search time or when connection relations are set. FIGS. 14(a) and 14(b) are configuration diagrams of a modification of the fifth embodiment. FIG. 14(a) shows the configuration when using the address dictionary 103 of FIG. 4, which holds the hierarchical structure as it is, and FIG. 14(b) shows the configuration when using the address dictionary 103 of FIG. 10, in which the hierarchical information is expressed as parent-child relationships between words.

【００４０】図14(a) に関しては、階層毎にポインタを
ずらしながら単語を検索していく方法を基に説明する。
この場合、階層省略情報獲得部115 が、住所辞書103 中
の単語の属性を調べ、省略される可能性があるという情
報が含まれている場合のみ、上述と同様に、それらの単
語及びそれらの単語の一階層又は何階層か下の全ての単
語を検索する。この方法では、一階層又は何階層か省略
された場合に、省略される可能性のある部分だけを検索
するので、処理量が増加しないという利点がある。FIG. 14(a) is explained on the basis of the method that searches for words while moving a pointer layer by layer. In this case, the layer omission information acquisition unit 115 examines the attributes of the words in the address dictionary 103 and, only when they include information that the word may be omitted, searches those words and all words one or more layers below them, in the same way as described above. Because only the parts that may be omitted are searched when one or more layers are omitted, this method has the advantage that the amount of processing does not increase.

【００４１】図14(b) に関しては、文章中に表れる全階
層中の全単語を検索し、その後に、辞書引きされた単語
間の接続関係を接続関係設定部113 で求める方法を基に
して説明する。この場合、階層省略情報獲得部115 が、
住所辞書103 中の単語の属性を調べて、省略される可能
性があるという情報が含まれている場合のみ、親情報の
ポインタを二回たどることで、一階層省略された記載さ
れている場合でも接続関係があることを判定することが
できる。また、親単語が省略される可能性がある場合に
は、親単語の親単語を親情報として持つというように、
複数個の親情報を持つ構成としてもよい。FIG. 14(b) is explained on the basis of the method that first looks up all words of all layers that appear in the text and then has the connection relation setting unit 113 determine the connection relations between the words found in the dictionary. In this case, the layer omission information acquisition unit 115 examines the attributes of the words in the address dictionary 103 and, only when they include information that the word may be omitted, follows the parent-information pointer twice, so that a connection relation can be recognized even when the address is written with one layer omitted. Alternatively, a word may hold multiple pieces of parent information; for example, when its parent word may be omitted, it also holds the parent of its parent word as parent information.
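The parent-pointer check with an "omissible" attribute can be illustrated with a short Python sketch. Everything named here is an assumption made for illustration: the `Entry` record, the `omissible` flag, the `max_skips` parameter, and the katakana readings in the usage note are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    notation: str
    reading: str
    parent: Optional["Entry"] = None  # parent word in the hierarchy
    omissible: bool = False           # hypothetical "may be omitted" attribute


def is_connected(prev: Entry, word: Entry, max_skips: int = 1) -> bool:
    """Return True if `word` may directly follow `prev` in an address.

    The parent pointer is followed once for the normal case and a second
    time (up to `max_skips` extra hops) only across layers marked omissible.
    """
    node = word.parent
    skips = 0
    while node is not None:
        if node is prev:
            return True
        if not node.omissible or skips == max_skips:
            return False
        node = node.parent  # skip the omitted layer
        skips += 1
    return False
```

With 緑区 marked omissible under 横浜市, 長津田 is accepted both after 緑区 and directly after 横浜市, which covers the 『神奈川県横浜市長津田』 case. Restricting the second hop to flagged layers mirrors the stated advantage that the extra search is done only where an omission is possible.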

【００４２】〔実施の形態６〕図15は本発明の音声合成
装置の実施の形態６の構成図であって、上述の実施の形
態と同一部分には同一符号を付してその説明を省略す
る。この実施の形態では、全ての可能な単語列候補を作
成した後で、その中から一番確からしい候補を選択する
単語列候補選択部116 と、文章中に一致した住所辞書10
3 中の単語列候補に対して住所の読みを選択するか否か
を判定する住所読み判定部117 とが設けられている。[Sixth Embodiment] FIG. 15 is a configuration diagram of the sixth embodiment of the speech synthesizer of the present invention. Parts identical to those of the embodiments described above are given the same reference numerals and their description is omitted. This embodiment provides a word string candidate selection unit 116, which creates all possible word string candidates and then selects the most probable one, and an address reading determination unit 117, which determines whether to select the address reading for a word string candidate in the address dictionary 103 that matched the text.

【００４３】即ち、住所辞書にマッチングする際に必ず
住所辞書を参照すると、以下のような、住所以外の部分
を住所の読みに置き換えてしまうという不具合がある。
例えば、『化石（バケ′イシ）』、『三角（ミカド、
ミ′スミ）』、『山寺（ヤ′マジ）』、『小文字（コモ
ンジ）』、『大文字（ダ′イモンジ）』等のように、普
通名詞と異なる読みを持つ地名が存在する場合がある。
また、たとえ読みが同じであっても、アクセント型又は
アクセント結合属性が違うために、読み上げるアクセン
トが違ってくる場合も発生する。従って、このような場
合には住所の読みで読み上げないようにする必要があ
る。この不具合を回避するため、実施の形態６では、所
定の単語数又は文字数より少ない単語数又は文字数しか
一致しない場合は住所の読みを選択しないようにする。That is, if the address dictionary reading is applied whenever a match is found, parts of the text that are not addresses can be replaced with address readings, as in the following cases. Some place names have readings that differ from the common noun written the same way, for example 『化石（バケ′イシ）』, 『三角（ミカド、ミ′スミ）』, 『山寺（ヤ′マジ）』, 『小文字（コモンジ）』, and 『大文字（ダ′イモンジ）』. Moreover, even when the reading is the same, the accent type or accent combination attribute may differ, so that the word is read out with a different accent. In such cases the text must not be read with the address reading. To avoid this problem, the sixth embodiment does not select the address reading when fewer words or characters match than a predetermined number of words or characters.

【００４４】例えば、《岡山県（オカヤマ′ケン）−上
房郡（ジョーボ′ーグン）−賀陽町（カヨーチョー）−
北（キ′タ）−門（カド）》、《大分県（オーイタ′ケ
ン）−東国東郡（ヒガシクニサキ′グン）−国見町（ク
ニミチョー）−中（ナ′カ）─下（シモ）》という地名
が住所辞書103 に登録されている場合、『北門で待つ』
『上中下』等の文章が入力された場合でも「北門」の２
文字、又は「中下」の２文字しかマッチングしない場合
には住所の読みを選択しないようにして基本辞書105 を
参照するようにすれば、正しく読み上げることができ
る。For example, suppose the place names 《岡山県（オカヤマ′ケン）−上房郡（ジョーボ′ーグン）−賀陽町（カヨーチョー）−北（キ′タ）−門（カド）》 and 《大分県（オーイタ′ケン）−東国東郡（ヒガシクニサキ′グン）−国見町（クニミチョー）−中（ナ′カ）─下（シモ）》 are registered in the address dictionary 103. Even when a sentence such as 『北門で待つ』 ("wait at the north gate") or 『上中下』 ("upper, middle, lower") is input, if only the two characters 「北門」 or 「中下」 match, the address reading is not selected and the basic dictionary 105 is consulted instead, so the text is read out correctly.

【００４５】図16は実施の形態６のアルゴリズムのフロ
ーチャートである。候補単語列の単語数又は文字数を求
め（Ｓ401 ）、求めた単語数又は文字数がしきい値より
小さいか否かを判定する（Ｓ402 ）。候補単語列の単語
数又は文字数がしきい値以上の場合は住所の読みを選択
する（Ｓ403 ）。一方、求めた単語数又は文字数がしき
い値より小さい場合は住所の読みを選択せずに終了す
る。FIG. 16 is a flowchart of the algorithm of the sixth embodiment. The number of words or characters in the candidate word string is obtained (S401), and it is determined whether that number is smaller than a threshold (S402). If the number of words or characters is equal to or greater than the threshold, the address reading is selected (S403). If it is smaller than the threshold, the process ends without selecting the address reading.
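The threshold test of FIG. 16 amounts to a single comparison. The Python sketch below counts characters; the function name and the default threshold of 3 are assumptions chosen so that the two-character matches 「北門」 and 「中下」 are rejected, since the patent says only that the threshold is predetermined.

```python
from typing import Sequence


def use_address_reading(matched_words: Sequence[str], min_chars: int = 3) -> bool:
    """Decide whether a dictionary match is long enough to be read as an address."""
    n_chars = sum(len(word) for word in matched_words)  # S401
    return n_chars >= min_chars                          # S402, S403
```

A word-count variant would compare `len(matched_words)` against a word threshold instead; the patent allows either measure.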

【００４６】〔実施の形態７〕図17は本発明の音声合成
装置の実施の形態７の構成図であって、上述の実施の形
態と同一部分には同一符号を付してその説明を省略す
る。実施の形態７では、自然な発声の合成音声を得るた
めに韻律境界記号を設定する韻律境界設定部118が設け
られている。[Seventh Embodiment] FIG. 17 is a configuration diagram of the seventh embodiment of the speech synthesizer of the present invention. Parts identical to those of the embodiments described above are given the same reference numerals and their description is omitted. The seventh embodiment provides a prosodic boundary setting unit 118, which sets prosodic boundary symbols so that the synthesized speech sounds natural.

【００４７】即ち、全体のモーラ数の長い住所を読み上
げる場合、呼気段落境界及びフレーズ境界を設定しない
と、発声のピッチが低くなりすぎたり、息つぎが無い発
声で息苦しく聞こえたりする。例えば、『北海道札幌市
南区定山渓定山渓豊羽鉱山くるみ沢』という住所の読み
は、「ホッカ′イドーサッポロ′シミナミ′クジョ
ーザ′ンケージョーザンケートヨハコ′ーザンクル
ミ′サワ」であるが、一気に読むと非常に不自然に聞こ
える。また、区切りすぎでも不自然に聞こえる。そのた
めに、フレーズ境界又は呼気段落境界等の韻律境界記号
を適当な位置に設定する必要がある。That is, when an address with a large total number of moras is read out without breath-group boundaries or phrase boundaries, the pitch of the utterance drops too low, or the utterance has no pause for breath and sounds labored. For example, the reading of the address 『北海道札幌市南区定山渓定山渓豊羽鉱山くるみ沢』 is 「ホッカ′イドーサッポロ′シミナミ′クジョーザ′ンケージョーザンケートヨハコ′ーザンクルミ′サワ」, which sounds very unnatural when read in one breath. It also sounds unnatural when split into too many pieces. Prosodic boundary symbols, such as phrase boundaries and breath-group boundaries, therefore need to be set at appropriate positions.

【００４８】単語間境界記号を設定する第１の方法とし
て、住所辞書103 とマッチングした単語のモーラ数を累
積していき、累積モーラ数がしきい値を超えないよう
に、又は超える毎にフレーズ境界又は呼気段落境界等の
韻律境界記号を設定する方法が考えられる。As a first method of setting boundary symbols between words, the mora counts of the words matched against the address dictionary 103 are accumulated, and a prosodic boundary symbol such as a phrase boundary or breath-group boundary is set so that the accumulated mora count does not exceed a threshold, or each time it does.

【００４９】図18はこのアルゴリズムのフローチャート
である。累積モーラ数に現単語モーラ数を設定する（Ｓ
501 ）。地名単語の読みをバッファ（図示せず）に設定
し（Ｓ502 ）、地名単語のポインタを次に進める（Ｓ50
3 ）。地名単語候補列が終了か否かを判定し（Ｓ504
）、終了でない場合は地名単語のモーラ数を加算する
（Ｓ505 ）。累積モーラ数がしきい値を超えたか否かを
判定し（Ｓ506 ）、しきい値を超えるまでステップＳ50
2 〜Ｓ505 を繰り返す。累積モーラ数がしきい値を超え
ると、呼気段落記号をバッファに設定し（Ｓ507 ）、ス
テップＳ501 に戻って地名単語候補列が終了するまで、
ステップＳ501〜Ｓ507 を繰り返す。FIG. 18 is a flowchart of this algorithm. The cumulative mora count is set to the mora count of the current word (S501). The reading of the place-name word is placed in a buffer (not shown) (S502), and the place-name word pointer is advanced to the next word (S503). It is determined whether the place-name word candidate string has ended (S504); if not, the mora count of the place-name word is added (S505). It is determined whether the cumulative mora count has exceeded the threshold (S506), and steps S502 to S505 are repeated until it does. When the cumulative mora count exceeds the threshold, a breath-group symbol is placed in the buffer (S507), and the process returns to step S501; steps S501 to S507 are repeated until the place-name word candidate string ends.

【００５０】先の『北海道札幌市南区定山渓定山渓豊羽
鉱山くるみ沢』の例では、単語毎のモーラ数は｛６，
５，４，６，13, ５｝であるから、しきい値を13モーラ
に設定すれば、｛（６，５），（４，６），13, ５｝に
分割される。ここで、呼気段落記号を「・」で表した場
合、読みは「ホッカ′イドーサッポロ′シ・ミナミ′
クジョーザ′ンケー・ジョーザンケートヨハコ′ーザ
ン・クルミ′サワ」となり、自然に読み上げることがで
きる。In the earlier example 『北海道札幌市南区定山渓定山渓豊羽鉱山くるみ沢』, the mora counts of the words are {6, 5, 4, 6, 13, 5}, so with the threshold set to 13 moras the address is divided into {(6, 5), (4, 6), 13, 5}. Writing the breath-group symbol as 「・」, the reading becomes 「ホッカ′イドーサッポロ′シ・ミナミ′クジョーザ′ンケー・ジョーザンケートヨハコ′ーザン・クルミ′サワ」, which can be read out naturally.
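The mora-accumulation method can be sketched in a few lines of Python. The function name and the (reading, mora count) pair representation are assumptions for illustration, and the loop is a compact restatement of the FIG. 18 flow rather than a step-by-step transcription: a new breath group starts whenever adding the next word would push the running total above the threshold.

```python
from typing import List, Tuple


def split_by_mora(words: List[Tuple[str, int]], threshold: int) -> List[List[str]]:
    """Group word readings so that no multi-word group exceeds `threshold` moras."""
    groups: List[List[str]] = []
    total = 0
    for reading, moras in words:
        if not groups or total + moras > threshold:
            groups.append([])  # breath-group boundary goes before this word
            total = 0
        groups[-1].append(reading)
        total += moras
    return groups


# Mora counts {6, 5, 4, 6, 13, 5} from the example; the labels are plain
# romanizations used only to keep the sketch readable.
example = [("Hokkaido", 6), ("Sapporo-shi", 5), ("Minami-ku", 4),
           ("Jozankei", 6), ("Jozankei Toyoha Kozan", 13), ("Kurumisawa", 5)]
print(split_by_mora(example, 13))
```

With a threshold of 13 this reproduces the grouping {(6, 5), (4, 6), 13, 5} given in the text; joining the groups with 「・」 yields the segmented reading.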

【００５１】単語間境界記号を設定する第２の方法とし
て、住所辞書103 中の地名単語の階層データ構造中に境
界記号を含めておき、それを参照してフレーズ境界又は
呼気段落境界等の韻律境界記号を設定する方法が考えら
れる。図19はこのアルゴリズムのフローチャートであ
る。地名単語の読みをバッファに設定する（Ｓ601 ）。
地名単語のポインタを次に進め（Ｓ602 ）、地名単語候
補列が終了か否かを判定する（Ｓ603 ）。地名単語候補
列が終了でない場合は韻律境界記号があれば獲得して設
定し（Ｓ604 ）、ステップＳ601 に戻って、地名単語候
補列が終了するまでステップＳ601 〜Ｓ604を繰り返
す。As a second method of setting boundary symbols between words, boundary symbols are included in advance in the hierarchical data structure of the place-name words in the address dictionary 103 and are consulted to set prosodic boundary symbols such as phrase boundaries and breath-group boundaries. FIG. 19 is a flowchart of this algorithm. The reading of the place-name word is placed in the buffer (S601). The place-name word pointer is advanced to the next word (S602), and it is determined whether the place-name word candidate string has ended (S603). If it has not, any prosodic boundary symbol present is retrieved and set (S604), and the process returns to step S601; steps S601 to S604 are repeated until the place-name word candidate string ends.

【００５２】第２の方法では、モーラ数だけ見るのでは
なく、予めフレーズ境界及び呼気段落境界記号を区別し
て入れることができるので、より自然に発声できる。前
述の例で示すと、例えば『札幌市』と『南区』とを一緒
に発声した方がより自然であるため、それらの情報を住
所辞書103 に格納しておく。ここで、呼気段落境界記号
を「・」、フレーズ境界記号を「／」で表した場合、前
述の例を「ホッカ′イドー／サッポロ′シミナミ′ク
／ジョーザ′ンケー・ジョーザンケートヨハコ′ーザン
／クルミ′サワ」と読み上げることが可能になる。The second method does not rely on the mora count alone: phrase boundary symbols and breath-group boundary symbols can be entered in advance and distinguished from each other, so the utterance sounds more natural. In the example above, 『札幌市』 and 『南区』 sound more natural when uttered together, so this information is stored in the address dictionary 103. Writing the breath-group boundary symbol as 「・」 and the phrase boundary symbol as 「／」, the example can be read out as 「ホッカ′イドー／サッポロ′シミナミ′ク／ジョーザ′ンケー・ジョーザンケートヨハコ′ーザン／クルミ′サワ」.

【００５３】〔実施の形態８〕図20は本発明の音声合成
装置の実施の形態８の構成図であって、図中、上述の実
施の形態と同一部分には同一符号を付してその説明を省
略する。実施の形態８では、マイナス記号，長音記号等
で表記されている番地用記号を、番地用の読みに変換す
る番地用記号変換部119 が設けられている。即ち、『東
京都千代田区丸の内１−６−１』と表記された住所を含
む文章が入力された場合、番地『１−６−１』を「イチ
・マイナス・ロク・マイナス・イチ」と読み上げてしま
うと、住所の読みとして違和感を与える。そのため、住
所に上記のような連続する数値が表記されている場合
に、『１の６の１』、即ち、「イチ′ノ・ロク′ノ・イ
チ」と読み上げる。[Eighth Embodiment] FIG. 20 is a configuration diagram of the eighth embodiment of the speech synthesizer of the present invention. In the figure, parts identical to those of the embodiments described above are given the same reference numerals and their description is omitted. The eighth embodiment provides an address-number symbol conversion unit 119, which converts address-number symbols written as a minus sign, a long-vowel mark, or the like into the reading used for address numbers. That is, when a sentence containing the address 『東京都千代田区丸の内１−６−１』 (1-6-1 Marunouchi, Chiyoda-ku, Tokyo) is input, reading the address number 『１−６−１』 as 「イチ・マイナス・ロク・マイナス・イチ」 ("one minus six minus one") sounds wrong as the reading of an address. Therefore, when an address is written with consecutive numerals like this, it is read out as 『１の６の１』, that is, 「イチ′ノ・ロク′ノ・イチ」.

【００５４】図21はこのアルゴリズムのフローチャート
である。文章中に一致した住所辞書103 中の単語列から
決定される住所を認識した後、その住所区間に続く文字
を検索していく。次の文字が数字（０〜９、〇〜九、
十、百、千、万）か否かを判定し（Ｓ701 ）、数字の場
合はポインタを一文字分進め（Ｓ702 ）、ステップＳ70
1 に戻る。次の文字が数字でない場合は、次の単語が助
数詞（番地、番、丁目、号）か否かを判定し（Ｓ703
）、助数詞の場合はポインタを単語の文字分進め（Ｓ7
04 ）、ステップＳ701 に戻る。FIG. 21 is a flowchart of this algorithm. After the address determined from the word string in the address dictionary 103 that matched the text has been recognized, the characters following that address span are examined. It is determined whether the next character is a numeral (０〜９, 〇〜九, 十, 百, 千, 万) (S701); if it is, the pointer is advanced by one character (S702) and the process returns to step S701. If the next character is not a numeral, it is determined whether the next word is a counter word (番地, 番, 丁目, 号) (S703); if it is, the pointer is advanced by the length of the word (S704) and the process returns to step S701.

【００５５】次の単語が助数詞でない場合は次の文字が
区切り記号（−、ー）か否かを判定し（Ｓ705 ）、区切
り記号の場合は文字を『の』で置き換える（Ｓ706 ）。
ここで、区切り記号として長音記号『ー』を含めたの
は、マイナス記号『−』を長音記号で誤って表記するケ
ースも多いので、誤って表記されていても読めるように
するためである。以上のいずれの文字、単語でもない場
合は処理を終了する。これらの処理を行うことで、数値
間の区切り記号文字を含む番地の記述を正しく読み上げ
ることができる。If the next word is not a counter word, it is determined whether the next character is a separator symbol (−, ー) (S705); if it is, the character is replaced with 『の』 (S706). The long-vowel mark 『ー』 is included among the separator symbols because the minus sign 『−』 is often mistakenly written as a long-vowel mark, and including it allows the text to be read even when it is written that way. If the next item is none of the above characters or words, the process ends. This processing allows an address-number notation that contains separator characters between numerals to be read out correctly.
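The scan of FIG. 21 can be sketched in Python as follows. The function and constant names are assumptions for illustration. The separator set contains the minus sign (U+2212) and long-vowel mark (U+30FC) named in the text; the ASCII hyphen is an extra assumption added here so that plain-ASCII input also works.

```python
DIGITS = set("0123456789０１２３４５６７８９〇一二三四五六七八九十百千万")
COUNTERS = ("番地", "丁目", "番", "号")   # longer entries first so 番地 wins over 番
SEPARATORS = {"\u2212", "\u30fc", "-"}   # minus sign, long-vowel mark, ASCII hyphen


def convert_address_number(text: str, start: int) -> str:
    """Rewrite separators after position `start` (end of the address span) as の."""
    out = []
    pos = start
    while pos < len(text):
        ch = text[pos]
        if ch in DIGITS:                                   # S701, S702
            out.append(ch)
            pos += 1
            continue
        counter = next((c for c in COUNTERS if text.startswith(c, pos)), None)
        if counter is not None:                            # S703, S704
            out.append(counter)
            pos += len(counter)
            continue
        if ch in SEPARATORS:                               # S705, S706
            out.append("の")
            pos += 1
            continue
        break                                              # none of the above: stop
    return text[:start] + "".join(out) + text[pos:]
```

For example, `convert_address_number("丸の内1-6-1に", 3)` leaves the place name untouched and rewrites the number part as 1の6の1. Like the flowchart as described, the sketch does not check that a separator is actually surrounded by numerals; a production version would probably want that guard.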

【００５６】〔実施の形態９〕図22は本発明の音声合成
装置の実施の形態９の構成図である。なお、上述の実施
例と同一部分には同一符号を付してその説明を省略す
る。例えば、『東京都豊島区立第三中学校』という文章
が入力された場合に、文章中に一致した住所辞書103 の
単語列から決定される住所区間のみを発音情報に変換し
てしまうと、「トーキョ′ートトシマ′ク」＋『立第
三中学校』と解釈し、結果的に文章解析部104 で『立』
を「タ′チ」と読んでしまう。[Ninth Embodiment] FIG. 22 is a configuration diagram of the ninth embodiment of the speech synthesizer of the present invention. Parts identical to those of the embodiments described above are given the same reference numerals and their description is omitted. For example, when the text 『東京都豊島区立第三中学校』 (a junior high school run by Toshima Ward, Tokyo) is input and only the address span determined from the matching word string in the address dictionary 103 is converted into pronunciation information, the text is interpreted as 「トーキョ′ートトシマ′ク」 + 『立第三中学校』, and as a result the sentence analysis unit 104 reads 『立』 as 「タ′チ」.

【００５７】従って、実施の形態９では、『立、内、
民、行き、発、着、…』等の語彙からなる住所接尾語及
びアクセント結合属性等が格納されている住所接尾語辞
書120と、文章中の単語に一致した住所辞書103 中の単
語列から決定される住所区間の直後に続く単語に一致す
る単語を住所接尾語辞書120 から検索する住所接尾語辞
書検索部121 と、住所接尾語辞書検索部121 の検索によ
って住所の後ろに住所接尾語が存在した場合は、文章中
の単語に一致した住所辞書103 中の単語列からなる住所
区間に住所接尾語が含まれるように修正し、文章中の単
語に一致した住所辞書103 中の単語列の最終単語と住所
接尾語とを住所接尾語に設定されているアクセント結合
属性に応じてアクセント結合し、読みを設定するアクセ
ント結合処理部122 とが設けられている。Accordingly, the ninth embodiment provides three components. The first is an address suffix dictionary 120, which stores address suffixes drawn from a vocabulary such as 『立、内、民、行き、発、着、…』 together with their accent combination attributes and the like. The second is an address suffix dictionary search unit 121, which searches the address suffix dictionary 120 for a word matching the word that immediately follows the address span determined from the word string in the address dictionary 103 that matched the words in the text. The third is an accent combination processing unit 122. When the search by the address suffix dictionary search unit 121 finds an address suffix after the address, this unit extends the address span so that it includes the address suffix, combines the accents of the final word of the matching word string and the address suffix according to the accent combination attribute set for the address suffix, and sets the reading.

【００５８】図23は実施の形態９のアルゴリズムのフロ
ーチャートである。住所の単語列の次の単語が住所接尾
語辞書120 にあるか否かを判定し（Ｓ801）、住所接尾
語辞書120 にある場合はアクセント結合処理部122 がア
クセント結合する（Ｓ802 ）。このような処理を行うこ
とで、先の例の文章情報を「トーキョ′ートトシマ
ク′リツ」＋『第三中学校』という情報として文章解析
部105 に渡せるため、正しく読み上げることができる。FIG. 23 is a flowchart of the algorithm of the ninth embodiment. It is determined whether the word following the address word string is in the address suffix dictionary 120 (S801); if it is, the accent combination processing unit 122 performs accent combination (S802). With this processing, the text of the earlier example can be passed to the sentence analysis unit 105 as 「トーキョ′ートトシマク′リツ」 + 『第三中学校』, so it is read out correctly.
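The suffix lookup of step S801 can be illustrated with a small Python sketch. It covers only the span extension, not the accent combination of step S802. The function name, the dictionary layout, and all readings except リツ for 『立』 (which appears in the example above) are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical suffix dictionary: notation -> reading (accent attributes omitted).
ADDRESS_SUFFIXES: Dict[str, str] = {"立": "リツ", "内": "ナイ", "行き": "ユキ"}


def extend_with_suffix(
    text: str, end: int, suffixes: Dict[str, str] = ADDRESS_SUFFIXES
) -> Tuple[int, Optional[str]]:
    """If an address suffix starts at `end`, return the extended end and its reading."""
    for notation in sorted(suffixes, key=len, reverse=True):  # prefer longer suffixes
        if text.startswith(notation, end):
            return end + len(notation), suffixes[notation]
    return end, None  # no suffix: the address span is left unchanged
```

For 『東京都豊島区立第三中学校』 with the address span ending at index 6, the call returns `(7, "リツ")`, so 『立』 is absorbed into the address span and only 『第三中学校』 is left for ordinary sentence analysis.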

【００５９】〔実施の形態10〕図24は本発明の音声合成
装置の実施の形態10の構成図である。なお、上述の実施
例と同一部分には同一符号を付してその説明を省略す
る。例えば、文章中に住所が出現した後で、その住所に
含まれる単語が繰り返し出現する場合、例えば、『神奈
川県中原区上小田中に、上小田中公民館はあります。』
という文章で、地名の『神奈川県中原区上小田中』は正
しく読めるが、次に出てくる『上小田中公民館は…』と
いう文章に関しては、先で読めているにもかかわらず、
同じ表記で別の読みが登録されているために間違った読
みで読み上げることがある。また、実施の形態６のよう
な構成を採っている場合、文字数（単語数）が所定のし
きい値より少ないために住所読みで処理できないことも
起こり得る。[Tenth Embodiment] FIG. 24 is a configuration diagram of the tenth embodiment of the speech synthesizer of the present invention. Parts identical to those of the embodiments described above are given the same reference numerals and their description is omitted. Consider the case where an address appears in the text and a word contained in that address then appears again, as in the sentence 『神奈川県中原区上小田中に、上小田中公民館はあります。』 ("The Kamikodanaka community center is in Kamikodanaka, Nakahara-ku, Kanagawa Prefecture."). The place name 『神奈川県中原区上小田中』 is read correctly, but the following 『上小田中公民館は…』 may be read with the wrong reading, even though the same word was read correctly just before, because another reading is registered for the same notation. In addition, with a configuration like that of the sixth embodiment, the number of characters (or words) may fall below the predetermined threshold, so that the address reading cannot be applied.

【００６０】従って、実施の形態10では、住所辞書検索
部102 で処理された住所の全ての構成単語に対して、漢
字表記と発音情報とをペアにして学習用単語バッファ12
4 に格納する住所単語学習部123 と、最近使用した住所
の表記と読みとを格納する学習用単語バッファ124 と、
階層構造の住所辞書103 を検索した後の文章に対して学
習用単語の検索を行い、学習用単語に一致する単語が存
在する場合には、該当する文章の部分に、対応する発音
情報を埋め込む学習用単語検索部125 とが設けられてい
る。学習用単語バッファ124 に格納されている単語は、
その読みを優先的に使用することによって、文章中に一
度でも出現した住所の一部が次に出現した場合に正しく
読み上げることができる。Accordingly, the tenth embodiment provides three components. An address word learning unit 123 stores in a learning word buffer 124, for every constituent word of an address processed by the address dictionary search unit 102, the kanji notation paired with its pronunciation information. The learning word buffer 124 stores the notations and readings of recently used addresses. A learning word search unit 125 searches the text for learned words after the hierarchical address dictionary 103 has been searched and, when a word matching a learned word is found, embeds the corresponding pronunciation information in that part of the text. Because the readings of words stored in the learning word buffer 124 are used preferentially, part of an address that has appeared even once in the text is read out correctly when it appears again.

【００６１】また、学習用単語バッファ124 内の内容を
初期化する学習用単語削除部126 を設けてもよい。この
場合、学習用単語バッファ124 は通常は初期化せずに使
用者の指定に応じて初期化する構成であっても、また、
文章情報の入力の都度、初期化する構成であってもよ
い。さらに、登録された後に入力された文章の数が所定
数を超えた時点で初期化する構成であってもよい。A learning word deletion unit 126 that initializes the contents of the learning word buffer 124 may also be provided. In this case, the learning word buffer 124 may normally be left uninitialized and be initialized at the user's request, or it may be initialized each time text information is input. It may also be initialized when the number of sentences input since a word was registered exceeds a predetermined number.
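The learning word buffer can be sketched as a small Python class. The class and method names are assumptions, the katakana readings in the usage note are illustrative rather than taken from the patent, and of the reset policies described only the explicit reset is shown.

```python
from typing import Dict, Iterable, List, Tuple


class LearnedWordBuffer:
    """Remembers notation/reading pairs of addresses that were already read."""

    def __init__(self) -> None:
        self._readings: Dict[str, str] = {}

    def learn(self, words: Iterable[Tuple[str, str]]) -> None:
        """Store every constituent word of a processed address."""
        for notation, reading in words:
            self._readings[notation] = reading

    def find(self, text: str) -> List[Tuple[int, str, str]]:
        """Return (position, notation, reading) for learned words found in text."""
        hits = []
        pos = 0
        while pos < len(text):
            match = max((n for n in self._readings if text.startswith(n, pos)),
                        key=len, default=None)  # longest learned word at pos
            if match:
                hits.append((pos, match, self._readings[match]))
                pos += len(match)
            else:
                pos += 1
        return hits

    def clear(self) -> None:
        """Initialize the buffer, e.g. on user request or per input text."""
        self._readings.clear()
```

After `learn([("中原区", "ナカハラク"), ("上小田中", "カミコダナカ")])`, a later `find("上小田中公民館")` returns the learned reading for 上小田中 even though the full address is no longer present. Calling `clear()` corresponds to the learning word deletion unit.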

【００６２】なお、本発明の音声合成装置において実施
される音声合成方法は、音声合成装置のＲＯＭに書き込
んでおく以外に、図25に示すように、コンパクトディス
ク等の記録媒体Ｄに記録しておき、この記録媒体Ｄをパ
ーソナルコンピュータのディスクドライブに装填して音
声合成する構成であってもよい。The speech synthesis method carried out by the speech synthesizer of the present invention need not be written into the ROM of the speech synthesizer. As shown in FIG. 25, it may instead be recorded on a recording medium D such as a compact disc, and speech may be synthesized by loading the recording medium D into the disk drive of a personal computer.

【００６３】なお、上述の実施の形態では単語列が住所
の場合について説明したが、単語列は住所に限らず、階
層構造を有する単語列であれば同様の効果が得られる。In the above embodiment, the case where the word string is an address has been described, but the word string is not limited to an address and the same effect can be obtained as long as the word string has a hierarchical structure.

【００６４】[0064]

【発明の効果】以上のように、本発明の音声合成装置、
音声合成方法及びこの方法のコンピュータプログラムを
記録している記録媒体は、一つの表記に対して複数の読
みが存在する住所等の表記及び読みの情報を階層化した
階層構造の辞書を持つので、一つの表記に対して複数の
読みが存在する住所等の文章情報を正確な読みで読み上
げるという優れた効果を奏する。As described above, the speech synthesizer, the speech synthesis method, and the recording medium storing a computer program for this method according to the present invention have a hierarchically structured dictionary in which the notation and reading information of addresses and the like, for which one notation has multiple readings, is arranged in layers. They therefore have the excellent effect of reading out text information such as addresses, for which one notation has multiple readings, with the correct reading.

[Brief description of drawings]

【図１】本発明の音声合成装置の基本ブロック図であ
る。FIG. 1 is a basic block diagram of a speech synthesizer of the present invention.

【図２】本発明の音声合成装置の一例の模式図である。FIG. 2 is a schematic diagram of an example of a speech synthesizer of the present invention.

【図３】本発明の音声合成装置の実施の形態１の構成図
である。FIG. 3 is a configuration diagram of a first embodiment of a speech synthesizer of the present invention.

【図４】住所辞書の一例の概念図である。FIG. 4 is a conceptual diagram of an example of an address dictionary.

【図５】実施の形態１のアルゴリズムのフローチャート
である。FIG. 5 is a flowchart of the algorithm of the first embodiment.

【図６】本発明の音声合成装置の実施の形態２の構成図
である。FIG. 6 is a configuration diagram of a second embodiment of a speech synthesizer of the present invention.

【図７】本発明の音声合成装置の実施の形態３及びその
変形例の構成図である。FIG. 7 is a configuration diagram of a third embodiment of the speech synthesis apparatus of the present invention and its modification.

【図８】検索開始位置格納バッファの概念図である。FIG. 8 is a conceptual diagram of a search start position storage buffer.

【図９】本発明の音声合成装置の実施の形態４の構成図
である。FIG. 9 is a configuration diagram of a fourth embodiment of a speech synthesizer of the present invention.

【図１０】住所辞書の他の例の概念図である。FIG. 10 is a conceptual diagram of another example of an address dictionary.

FIG. 11 is a flowchart (part 1) of the algorithm of the fourth embodiment.

FIG. 12 is a flowchart (part 2) of the algorithm of the fourth embodiment.

FIG. 13 is a configuration diagram of the fifth embodiment of the speech synthesizer of the present invention.

FIG. 14 is a configuration diagram of a modification of the fifth embodiment of the speech synthesizer of the present invention.

FIG. 15 is a configuration diagram of the sixth embodiment of the speech synthesizer of the present invention.

FIG. 16 is a flowchart of the algorithm of the sixth embodiment.

FIG. 17 is a configuration diagram of the seventh embodiment of the speech synthesizer of the present invention.

FIG. 18 is a flowchart (part 1) of the algorithm of the seventh embodiment.

FIG. 19 is a flowchart (part 2) of the algorithm of the seventh embodiment.

FIG. 20 is a configuration diagram of the eighth embodiment of the speech synthesizer of the present invention.

FIG. 21 is a flowchart of the algorithm of the eighth embodiment.

FIG. 22 is a configuration diagram of the ninth embodiment of the speech synthesizer of the present invention.

FIG. 23 is a flowchart of the algorithm of the ninth embodiment.

FIG. 24 is a configuration diagram of the tenth embodiment of the speech synthesizer of the present invention.

FIG. 25 is a conceptual diagram of the recording state of the recording medium of the present invention.

[Explanation of symbols]

1 Text input section
2 Hierarchical dictionary search section
3 Hierarchical dictionary
4 Text analysis section
5 Basic dictionary
6 Speech waveform generation section
7 Speaker

Continuation of the front page

(72) Inventor: Hidetoshi Tsujiuchi, c/o Fujitsu Limited, 1015 Kamiodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Shinta Kimura, c/o Fujitsu Limited, 1015 Kamiodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa

Claims

[Claims]

1. A speech synthesizer for synthesizing speech from text information, comprising: input means for text information; a hierarchical dictionary in which the notation and reading information of each word of a word string, the word string consisting of a word group in which the reading of the word appearing next is determined by the word appearing first, are stored together with hierarchical structure information in which the words are hierarchized according to their connection order; hierarchical dictionary search means for searching the hierarchical dictionary for reading candidates of the words included in the input text information; text analysis means for selecting, from the reading candidates and on the basis of the hierarchical structure information, the reading of the word string included in the text information, and converting the word string into that reading; speech waveform generation means for generating a speech waveform from the reading information; and speech waveform output means.
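The dictionary search and reading selection recited in claim 1 can be illustrated with a minimal sketch. It assumes that each dictionary entry carries a link to its parent entry in the level above. The `Entry` class, the function names, and the sample address entries are hypothetical and are not taken from the specification; waveform generation is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Entry:
    notation: str           # written form of the word
    reading: str            # reading of the word (katakana)
    parent: Optional[int]   # index of the parent entry, None at the top level


# Illustrative entries only (assumption): the same notation appears under
# different parents with different readings.
ENTRIES: List[Entry] = [
    Entry("東京都", "トウキョウト", None),  # 0
    Entry("中央区", "チュウオウク", 0),     # 1
    Entry("日本橋", "ニホンバシ", 1),       # 2
    Entry("大阪府", "オオサカフ", None),    # 3
    Entry("大阪市", "オオサカシ", 3),       # 4
    Entry("中央区", "チュウオウク", 4),     # 5
    Entry("日本橋", "ニッポンバシ", 5),     # 6
]


def search_candidates(text: str, pos: int) -> List[int]:
    """Return indices of all entries whose notation starts at text[pos]."""
    return [i for i, e in enumerate(ENTRIES) if text.startswith(e.notation, pos)]


def read_word_string(text: str) -> Tuple[List[str], str]:
    """Select readings by following parent links; return (readings, unread rest)."""
    readings: List[str] = []
    pos = 0
    parent: Optional[int] = None
    while pos < len(text):
        # Keep only candidates whose parent is the word selected just before.
        matches = [i for i in search_candidates(text, pos)
                   if ENTRIES[i].parent == parent]
        if not matches:
            break
        chosen = matches[0]
        readings.append(ENTRIES[chosen].reading)
        pos += len(ENTRIES[chosen].notation)
        parent = chosen
    return readings, text[pos:]
```

In this sketch the notation 日本橋 receives a different reading depending on which chain of parent words precedes it, which is the behaviour the hierarchical structure information is intended to provide.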

2. The speech synthesizer according to claim 1, further comprising means for designating an area, wherein the hierarchical dictionary search means is means for starting the search from a hierarchy belonging to the designated area.

3. The speech synthesizer according to claim 1, wherein the hierarchical structure information is information specifying, for each word, the parent word in the higher hierarchy under which the reading of that word is to be adopted, the speech synthesizer further comprising connection relation setting means for setting, by referring to the reading candidate information retrieved from the hierarchical dictionary, the connection relations among the reading candidates of the words of the word string included in the text information.

4. The speech synthesizer according to [claim reference missing in the source], wherein the hierarchical dictionary search means includes means for searching the hierarchical dictionary on the basis of a word string in which the notation of one word of the word string is omitted.
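A search that tolerates one omitted word, as recited in claim 4, might be sketched as follows. The sketch assumes that an omission can be detected by checking whether a candidate's grandparent, rather than its parent, is the previously selected word. The entry table, `skips_needed`, and `read_with_omission` are hypothetical names used for illustration only.

```python
from typing import List, Optional, Tuple

# (notation, reading, parent index); illustrative entries, not from the patent.
ENTRIES: List[Tuple[str, str, Optional[int]]] = [
    ("大阪府", "オオサカフ", None),   # 0
    ("大阪市", "オオサカシ", 0),      # 1
    ("中央区", "チュウオウク", 1),    # 2
    ("日本橋", "ニッポンバシ", 2),    # 3
]


def skips_needed(idx: int, current: Optional[int]) -> Optional[int]:
    """Return 0 for a direct child of `current`, 1 if one level is omitted
    in between, and None if the entry cannot follow `current`."""
    parent = ENTRIES[idx][2]
    if parent == current:
        return 0
    if parent is not None and ENTRIES[parent][2] == current:
        return 1
    return None


def read_with_omission(text: str, max_omitted: int = 1) -> List[str]:
    """Read a word string while allowing up to `max_omitted` omitted words."""
    pos = 0
    current: Optional[int] = None
    budget = max_omitted
    readings: List[str] = []
    while pos < len(text):
        best: Optional[Tuple[int, int]] = None
        for idx, (notation, _, _) in enumerate(ENTRIES):
            if not text.startswith(notation, pos):
                continue
            cost = skips_needed(idx, current)
            # Prefer the candidate that needs the fewest omitted levels.
            if cost is not None and cost <= budget and (best is None or cost < best[1]):
                best = (idx, cost)
        if best is None:
            break
        idx, cost = best
        readings.append(ENTRIES[idx][1])
        pos += len(ENTRIES[idx][0])
        current, budget = idx, budget - cost
    return readings
```

With these sample entries, `read_with_omission("大阪市中央区日本橋")` still reaches the Osaka reading even though the prefecture is not written, while setting `max_omitted=0` makes the same input fail to match.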

5. The speech synthesizer according to any one of claims 1 to 4, further comprising reading determination means for determining, when the notation of a predetermined number of characters or words matches the notation of characters or words included in a word string stored in the hierarchical dictionary, the reading in the hierarchical dictionary as the reading of those characters or words.

6. The speech synthesizer according to claim 1, further comprising: a suffix dictionary storing the notation and reading information of suffixes that connect to the word string; and suffix dictionary search means for searching the suffix dictionary for a notation that matches the notation immediately following the word string, and selecting the reading stored in the suffix dictionary for the matching notation as the reading of that immediately following notation.
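The suffix lookup of claim 6 can be sketched as a prefix match on the text that immediately follows the word string. The suffix entries and the `read_suffix` function below are assumptions chosen for illustration; the specification may store different suffixes and may resolve competing matches differently from the longest-match rule used here.

```python
from typing import Dict, Optional, Tuple

# Illustrative suffix dictionary (assumption): notation -> reading.
SUFFIXES: Dict[str, str] = {
    "付近": "フキン",
    "周辺": "シュウヘン",
}


def read_suffix(rest: str) -> Tuple[Optional[str], str]:
    """Match the notation immediately after the word string against the
    suffix dictionary, trying longer notations first.
    Returns (reading or None, remaining text)."""
    for notation in sorted(SUFFIXES, key=len, reverse=True):
        if rest.startswith(notation):
            return SUFFIXES[notation], rest[len(notation):]
    return None, rest
```

For example, if the word string ends just before 付近の天気, the call `read_suffix("付近の天気")` yields the suffix reading together with the text that remains for ordinary analysis.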

7. A speech synthesis method for synthesizing speech from text information by referring to a hierarchical dictionary in which the notation and reading information of each word of a word string, the word string consisting of a word group in which the reading of the word appearing next is determined by the word appearing first, are stored together with hierarchical structure information in which the words are hierarchized according to their connection order, the method comprising: inputting text information; searching the hierarchical dictionary for reading candidates of the words included in the input text information; selecting, from the reading candidates and on the basis of the hierarchical structure information, the reading of the word string included in the text information, and converting the word string into that reading; generating a speech waveform from the reading information; and outputting the speech waveform.

8. A recording medium on which are recorded: a hierarchical dictionary in which the notation and reading information of each word of a word string, the word string consisting of a word group in which the reading of the word appearing next is determined by the word appearing first, are stored together with hierarchical structure information in which the words are hierarchized according to their connection order; and a computer program including the steps of: inputting text information; searching the hierarchical dictionary for reading candidates of the words included in the input text information; selecting, from the reading candidates and on the basis of the hierarchical structure information, the reading of the word string included in the text information, and converting the word string into that reading; generating a speech waveform from the reading information; and outputting the speech waveform.