JPH1083196A

JPH1083196A - Voice synthesizer, method therefor and information recording medium

Info

Publication number: JPH1083196A
Application number: JP8236006A
Authority: JP
Inventors: Naoko Satou; 奈穂子佐藤
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1996-09-06
Filing date: 1996-09-06
Publication date: 1998-03-31

Abstract

PROBLEM TO BE SOLVED: To set pause positions appropriately in the voice output of text data with simple processing. SOLUTION: Input text data is temporarily stored by a data storage means 22, divided into continuous phrases by a continuous phrase dividing means 23, and given pause positions by a pause setting means 24 based on the division positions. Because the pause positions of the text data are set based on the division positions between continuous phrases, they are set appropriately with simple data processing.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a speech synthesizer, a speech synthesis method, and an information storage medium.

[0002]

2. Description of the Related Art

Various computer-based devices are now in practical use, including speech synthesizers that output Japanese as speech. When natural-language Japanese text data is input, such a speech synthesizer analyzes it linguistically, converts it into a phonetic symbol string, and generates and outputs synthesized speech from that string.

[0003] However, if text data is simply output as speech, the result is a flat, monotone reading that sounds unnatural and is hard for listeners to understand. It is therefore desirable to output speech with the text data divided at suitable positions, as a human reader would. For this reason, pause positions, which are the break positions in the voice output, are generally set in the text data using its punctuation marks, phrase (bunsetsu) positions, and mora lengths.

[0004]

SUMMARY OF THE INVENTION

The speech synthesizers described above aim to output text data as natural-sounding speech.

[0005] However, with long text data containing many noun phrases and adverbial phrases with complicated modification relationships, such an approach cannot read the meaning accurately, and the pause positions are often not set properly. Various methods have been devised to solve this.

[0006] For example, Japanese Patent Application Laid-Open No. Hei 6-342297 describes setting pause positions by parsing the text data; Japanese Patent Publication No. Hei 3-237499 describes setting pause positions based on predetermined adjacent-word relationships and the cumulative number of morae; and Japanese Patent Publication No. Hei 6-59695 describes setting pause positions based on local dependency relationships between words and phrases.

[0007] However, each of these methods requires complicated processing to set the pause positions, and the result depends heavily on the accuracy of the processing data, such as that used for syntax analysis. In addition, the methods of Japanese Patent Application Laid-Open No. Hei 6-342297 and Japanese Patent Publication No. Hei 3-237499 only set the pause positions and do not adjust the pause lengths, so the resulting voice output is unnatural. Japanese Patent Publication No. Hei 6-59695 also describes adjusting the pause length based on various factors such as the nature of phrase boundaries, but this makes the adjustment of the pause length complicated as well.

[0008]

The speech synthesizer of the invention according to claim 1 comprises: a data input device that accepts input of various data; a voice output device that outputs various data as speech; a data storage device that temporarily stores various data; data input means for accepting natural-language Japanese text data input to the data input device; data storage means for temporarily storing the input text data in the data storage device; continuous phrase dividing means for dividing the temporarily stored text data into continuous phrases; pause setting means for setting pause positions for voice output in the text data based on the break positions between continuous phrases; and voice output means for causing the voice output device to output the text data as speech while pausing for a predetermined time at each pause position. Accordingly, natural-language Japanese text data is accepted by the data input means via the data input device, stored in the data storage device by the data storage means, and divided into continuous phrases by the continuous phrase dividing means; the pause setting means sets pause positions based on these break positions; and the voice output means outputs the text from the voice output device while pausing for a predetermined time at each pause position. Because the pause positions of the text data are set based on the break positions between continuous phrases, they are set appropriately with simple data processing.

[0009] In the invention according to claim 2, the speech synthesizer of claim 1 is provided with a word dictionary in which the words forming text data are registered together with part-of-speech information, and a part-of-speech sequence dictionary in which the part-of-speech sequences forming continuous phrases are registered. The continuous phrase dividing means divides the text data into words according to the contents of the word dictionary, determines the part of speech of each word, and collates the parts of speech of the text data with the contents of the part-of-speech sequence dictionary to divide the text data into continuous phrases. Accordingly, the continuous phrase dividing means divides the text data into continuous phrases based on the parts of speech of its words.

[0010] In the invention according to claim 3, in the speech synthesizer of claim 1, the continuous phrase dividing means detects continuous phrase candidates from the text data based on predetermined grammatical rules on the formation of continuous phrases, and selects the maximum likelihood solution from these candidates to determine the continuous phrases. Accordingly, the continuous phrase dividing means divides the text data into continuous phrases based on grammatical rules on the formation of continuous phrases.

[0011] In the invention according to claim 4, the speech synthesizer of claim 1 is provided with role determining means for individually determining the grammatical function of each continuous phrase detected from the text data, and the pause setting means sets the pause positions in accordance with the determined functions of the continuous phrases as well. Accordingly, the grammatical function of each continuous phrase detected from the text data is determined individually by the role determining means, and the pause setting means sets pause positions in the text data with these functions also taken into account, so the pause positions are set appropriately for the grammatical functions of the continuous phrases.

[0012] In the invention according to claim 5, the speech synthesizer of claim 4 is provided with a continuous phrase function dictionary in which the grammatical functions of continuous phrases are registered, and the role determining means determines the function of each continuous phrase detected from the text data according to the contents of the continuous phrase function dictionary. Accordingly, the role determining means can determine the grammatical function of a continuous phrase easily, by referring to the contents of the continuous phrase function dictionary.

[0013] In the invention according to claim 6, in the speech synthesizer of claim 4, the role determining means determines the function of a continuous phrase based on predetermined grammatical rules on the dependency relationships within the continuous phrase. Accordingly, the role determining means can determine the grammatical function of a continuous phrase easily, by applying these predetermined grammatical rules.

[0014] In the invention according to claim 7, the speech synthesizer of any one of claims 4 to 6 is provided with: information storage means for temporarily storing, in the data storage device, various information on the continuous phrases, including at least their break positions, order, and functions in the text data; relationship determining means for determining the relationships between continuous phrases based on the temporarily stored information; and time adjusting means for individually adjusting the pause time for which the voice output is interrupted at each pause position, based on the determined relationships. Accordingly, the information on the continuous phrases in the text data is stored in the data storage device by the information storage means, the relationships between continuous phrases are determined from this information by the relationship determining means, and the pause time at each pause position is individually adjusted by the time adjusting means based on these relationships. In other words, each pause time in the voice output of the text data is adjusted appropriately for the relationship between the continuous phrases.

[0015] In the invention according to claim 8, the speech synthesizer of claim 7 is provided with a connection likelihood dictionary in which the connection likelihoods between the functions of adjacent continuous phrases are registered, and the time adjusting means individually adjusts the pause times in accordance with the connection likelihoods as well. Accordingly, the time adjusting means adjusts each pause time appropriately, also taking into account the connection likelihood between the functions of adjacent continuous phrases.
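One way to picture a connection likelihood dictionary is the Python sketch below. It is only an illustration under stated assumptions: the function labels (`modifier`, `subject`, `predicate`), the likelihood values, the base pause of 300 ms, and above all the rule that a lower connection likelihood gives a longer pause are hypothetical. The text says only that the pause time is adjusted in accordance with the connection likelihood and does not give the mapping.

```python
# Hypothetical connection likelihoods between the grammatical functions of
# two adjacent continuous phrases (left function, right function).
CONNECTION_LIKELIHOOD = {
    ("modifier", "subject"): 0.8,
    ("subject", "predicate"): 0.4,
}

BASE_PAUSE_MS = 300  # assumed base pause time; not given in the source


def adjust_pause_ms(left_fn, right_fn, base_ms=BASE_PAUSE_MS, default=0.5):
    """Return a pause time for the boundary between two adjacent phrases.

    Assumption for this sketch: phrases that connect tightly (high
    likelihood) get a shorter pause, loosely connected ones a longer pause.
    """
    likelihood = CONNECTION_LIKELIHOOD.get((left_fn, right_fn), default)
    return round(base_ms * (1.5 - likelihood))
```

With these made-up values, the boundary after a modifier phrase is shortened and the boundary before the predicate is lengthened, while unknown function pairs fall back to the base pause.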

[0016] In the invention according to claim 9, the speech synthesizer of claim 7 or 8 is provided with word/phrase dividing means for dividing the text data into predetermined words or phrases; the pause setting means also sets pause positions at the division positions between words or phrases within a continuous phrase; and the time adjusting means sets the pause time between continuous phrases longer than the pause time between words or phrases. Accordingly, a long pause time is set between continuous phrases in the text data and a short pause time is set between the words or phrases within a continuous phrase, so the text data output as speech is broken briefly between words or phrases and at greater length between continuous phrases.
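The two-level pausing described here can be sketched in Python as follows. This is a minimal illustration, not the claimed implementation: the pause lengths of 50 ms and 300 ms, the `("speak", ...)` / `("pause", ...)` plan format, and the function name are assumptions made for the example. The only property taken from the text is that the pause between continuous phrases is longer than the pause between words inside a phrase.

```python
WORD_PAUSE_MS = 50     # hypothetical short pause between words in a phrase
PHRASE_PAUSE_MS = 300  # hypothetical long pause between continuous phrases


def build_two_level_plan(phrases, word_ms=WORD_PAUSE_MS, phrase_ms=PHRASE_PAUSE_MS):
    """Turn a list of continuous phrases (each a list of words) into an
    output plan with short pauses inside phrases and long pauses between them."""
    if phrase_ms <= word_ms:
        raise ValueError("the pause between phrases must be the longer one")
    plan = []
    for phrase_index, words in enumerate(phrases):
        if phrase_index > 0:
            plan.append(("pause", phrase_ms))   # boundary between phrases
        for word_index, word in enumerate(words):
            if word_index > 0:
                plan.append(("pause", word_ms))  # boundary between words
            plan.append(("speak", word))
    return plan
```

A downstream synthesizer would then render each `speak` item and insert silence of the stated length for each `pause` item.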

[0017] The speech synthesis method of the invention according to claim 10 accepts input of natural-language Japanese text data, temporarily stores the input text data, divides the temporarily stored text data into continuous phrases, sets pause positions for voice output in the text data based on the break positions between continuous phrases, and outputs the text data as speech while pausing for a predetermined time at each pause position. Accordingly, the input natural-language Japanese text data is output as speech while being interrupted for a predetermined time at each pause position corresponding to a break position between continuous phrases. Because the pause positions of the text data are set based on the break positions between continuous phrases, they are set appropriately with simple data processing.

[0018] The information storage medium of the invention according to claim 11 stores a program for causing a computer to: accept natural-language Japanese text data input to a data input device; temporarily store the input text data in a data storage device; divide the temporarily stored text data into continuous phrases; set pause positions for voice output in the text data based on the break positions between continuous phrases; and cause a voice output device to output the text data as speech while pausing for a predetermined time at each pause position. Accordingly, when a computer to which a data input device, a data storage device, and a voice output device are connected reads this program and executes the corresponding operations, the natural-language Japanese text data input to the data input device is temporarily stored in the data storage device and divided into continuous phrases, pause positions for voice output are set in the text data based on the break positions between the continuous phrases, and the text data is output as speech by the voice output device while being interrupted for a predetermined time at each pause position. Because the pause positions of the text data are set based on the break positions between continuous phrases, they are set appropriately with simple data processing.

[0019] The information storage medium of the invention according to claim 12 stores a program for causing a computer to divide text data into continuous phrases and to set pause positions for voice output in the text data based on the break positions between continuous phrases. Accordingly, when the computer of a speech synthesizer equipped with a data input device, a data storage device, and a voice output device reads this program and executes the corresponding operations, the natural-language Japanese text data input to the data input device is temporarily stored in the data storage device and divided into continuous phrases, pause positions for voice output are set in the text data based on the break positions between the continuous phrases, and the text data is output as speech by the voice output device while being interrupted for a predetermined time at each pause position. Because the pause positions of the text data are set based on the break positions between continuous phrases, they are set appropriately with simple data processing.

[0020]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

A first embodiment of the present invention is described in detail below with reference to FIGS. 1 to 5. The speech synthesizer 1 of this embodiment has, as its hardware, a computer system that is a data processing device.

[0021] As shown in FIGS. 2 and 3, this computer system has a CPU (Central Processing Unit) 2 as the main body of the computer. Connected to the CPU 2 via a bus line 3 are a ROM (Read Only Memory) 4, a RAM (Random Access Memory) 5, an HDD (Hard Disc Drive) 6, an FDD (FD Drive) 8 into which an FD (Floppy Disc) 7 is loaded, a CD-ROM drive 10 into which a CD (Compact Disc)-ROM 9 is loaded, a keyboard 12 to which a mouse 11 is connected, a display 13, a communication I/F (Interface) 14, a microphone 15, a speaker 16, and so on.

[0022] Because this computer system accepts external input of various data by various methods, it has, as data input devices that accept data input, the drives 8 and 9, the mouse 11 and the keyboard 12, the communication I/F 14, the microphone 15, and so on. As data output devices that perform data output, it has the FDD 8, the display 13, the communication I/F 14, the speaker 16, and so on; in particular, the speaker 16 functions as the voice output device that outputs various data as speech. As data storage devices that temporarily store various data, it has the RAM 5, the HDD 6, the FD 7, and so on, and as information storage media that can supply pre-recorded software to the CPU 2, it has the ROM 4, the RAM 5, the HDD 6, the FD 7, the CD-ROM 9, and so on.

[0023] More specifically, in this computer system, a control program that causes the CPU 2 to execute various processing operations is provided in advance as software; such a control program is, for example, recorded in advance on the CD-ROM 9. This software is installed in advance on the HDD 6 (not shown) and, when the computer system starts up, is copied to the RAM 5 and read by the CPU 2.

[0024] Because the CPU 2 reads the various programs in this way and executes the corresponding data processing, various functions are realized as various means, and the computer system operates as the speech synthesizer 1. The speech synthesizer 1 of this embodiment has, as such means, data input means 21, data storage means 22, continuous phrase dividing means 23, pause setting means 24, and voice output means 25. The continuous phrase dividing means 23 consists of word dividing means 26, which serves as the word/phrase dividing means, and continuous phrase determining means 27. The word dividing means 26 includes a word dictionary 28, and the continuous phrase determining means 27 includes a part-of-speech sequence dictionary 29.

[0025] The word dictionary 28 is stored, for example, as a data file on the HDD 6, and holds the text data of a large number of words together with their part-of-speech information. The part-of-speech sequence dictionary 29 is likewise stored as a data file on the HDD 6 and, as shown in FIG. 4, holds information on the part-of-speech sequences that form continuous phrases.

[0026] Through predetermined data processing by the CPU 2 according to the program in the RAM 5, the data input means 21 accepts natural-language Japanese text data input to a data input device such as the keyboard 12. Similarly, through data processing by the CPU 2 according to the program in the RAM 5, the data storage means 22 stores the input text data in a predetermined work area of the RAM 5.

[0027] The word dividing means 26 collates the temporarily stored text data with the contents of the word dictionary 28, divides the text data into words, and determines the part of speech of each word. The continuous phrase determining means 27 collates the parts of speech of the words forming the text data with the contents of the part-of-speech sequence dictionary 29, and divides the text data into continuous phrases, each consisting of a specific part-of-speech sequence. In other words, the continuous phrase dividing means 23, made up of these means 26 and 27, divides the temporarily stored text data into continuous phrases.
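The two collation steps described in this paragraph can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the dictionary entries, the part-of-speech labels (`noun`, `no`, `case`, and so on), the longest-match strategy, and the function names are all assumptions made for the example, since the actual contents of the word dictionary 28 and of the part-of-speech sequence dictionary 29 (FIG. 4) are not reproduced in this text.

```python
# Toy word dictionary (surface form -> part of speech). Hypothetical entries.
WORD_DICT = {
    "担任": "noun", "の": "no", "先生": "noun", "より": "case",
    "若い": "adj", "事務員": "noun", "が": "case",
    "赴任": "noun", "し": "verb", "た": "aux", "。": "punct",
}

# Toy part-of-speech sequence dictionary: each tuple is one POS sequence
# assumed to form a continuous phrase. Hypothetical stand-in for FIG. 4.
POS_SEQ_DICT = [
    ("noun", "no", "noun", "case", "adj"),
    ("noun", "case"),
    ("noun", "verb", "aux", "punct"),
]


def split_words(text):
    """Split text into (word, pos) pairs by longest match against WORD_DICT."""
    max_len = max(len(word) for word in WORD_DICT)
    words, i = [], 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + n]
            if candidate in WORD_DICT:
                words.append((candidate, WORD_DICT[candidate]))
                i += n
                break
        else:
            raise ValueError(f"no dictionary entry at position {i}: {text[i]!r}")
    return words


def split_continuous_phrases(words):
    """Group (word, pos) pairs into phrases whose POS sequence is registered."""
    tags = [pos for _, pos in words]
    phrases, i = [], 0
    while i < len(words):
        # Prefer the longest registered POS sequence that matches at word i.
        for seq in sorted(POS_SEQ_DICT, key=len, reverse=True):
            if tuple(tags[i:i + len(seq)]) == seq:
                phrases.append("".join(w for w, _ in words[i:i + len(seq)]))
                i += len(seq)
                break
        else:
            raise ValueError(f"no part-of-speech sequence matches at word {i}")
    return phrases
```

With these toy dictionaries, `split_continuous_phrases(split_words("担任の先生より若い事務員が赴任した。"))` yields the three phrases 担任の先生より若い, 事務員が, and 赴任した。. Changing the dictionary entries changes the segmentation without touching the code, which is the kind of dictionary-driven tuning the description relies on.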

[0028] The pause setting means 24 sets the break positions between continuous phrases in the text data as pause positions for voice output, and the voice output means 25 causes the speaker 16 to output the text data as speech while pausing for a predetermined time at each pause position. Here the CPU 2 does not drive the speaker 16 directly; rather, the CPU 2 converts the text data into an audio signal of a predetermined format that is interrupted at the pause positions, and inputs this audio signal to a driver circuit (not shown) for the speaker 16.
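The pause setting and output steps can be pictured with the Python sketch below. It is an illustration under stated assumptions only: the 300 ms value for the "predetermined time", the representation of pause positions as character offsets, and the `("speak", ...)` / `("pause", ...)` plan format are hypothetical, and the conversion of each spoken chunk into an audio signal is left out.

```python
PAUSE_MS = 300  # hypothetical "predetermined time"; the source gives no value


def set_pause_positions(phrases):
    """Return the character offsets, within the joined text, of every break
    between adjacent continuous phrases."""
    positions, offset = [], 0
    for phrase in phrases[:-1]:
        offset += len(phrase)
        positions.append(offset)
    return positions


def build_output_plan(text, pause_positions, pause_ms=PAUSE_MS):
    """Cut text at the pause positions and interleave fixed-length pauses."""
    plan, start = [], 0
    for position in pause_positions:
        plan.append(("speak", text[start:position]))
        plan.append(("pause", pause_ms))
        start = position
    plan.append(("speak", text[start:]))
    return plan
```

A synthesizer back end would render each `speak` chunk and emit silence for each `pause` entry before handing the signal to the speaker driver.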

[0029] The various means of the speech synthesizer 1 described above use hardware such as the keyboard 12 and the speaker 16 as necessary, but they are realized mainly by the CPU 2 operating according to the software recorded in the RAM 5 and the like. That is, the RAM 5 holds the word dictionary 28 and the part-of-speech sequence dictionary 29, which consist of software readable by the CPU 2, and a program that causes the CPU 2 to function as the means 21 to 27.

[0030] More specifically, the RAM 5 holds a program that causes the CPU 2 to: accept natural-language Japanese text data input to a data input device such as the keyboard 12; temporarily store the input text data in a data storage device such as the RAM 5; collate the temporarily stored text data with the contents of the word dictionary 28; divide the text data into words through this collation and determine the part of speech of each word; collate the parts of speech of the words forming the text data with the contents of the part-of-speech sequence dictionary 29; divide the text data, through this collation, into continuous phrases each consisting of a specific part-of-speech sequence; set the break positions between these continuous phrases in the RAM 5 as the pause positions of the text data for voice output; and cause the speaker 16 to output the text data as speech while pausing for a predetermined time at each pause position.

[0031] With this configuration, when natural-language Japanese text data is input, the speech synthesizer 1 of this embodiment can output it as speech while breaking it at natural positions. The speech synthesis method of the speech synthesizer 1 is described step by step below with reference to the flowchart of FIG. 5.

[0032] First, when natural-language Japanese text data is input through a data input device such as the keyboard 12, the CPU 2 records the text data in a predetermined area of the RAM 5. Next, the temporarily stored text data is collated with the contents of the word dictionary 28; through this collation the text data is divided into words and the part of speech of each word is determined.

[0033] Once the parts of speech of the words forming the text data have been determined, they are collated with the contents of the part-of-speech sequence dictionary 29, and through this collation the text data is divided into continuous phrases each consisting of a specific part-of-speech sequence. The break positions between these continuous phrases are set in the text data in the RAM 5 as pause positions for voice output, and the text data is output as speech by the speaker 16 while being interrupted for a predetermined time at each pause position.

[0034] For example, when the text data “担任の先生より若い事務員が赴任した。” (in one reading, "A clerk younger than the homeroom teacher has taken up the post.") is input, it is first divided into the words “担任／の／先生／より／若い／事務員／が／赴任／し／た／。”. Next, it is divided into the continuous phrases “担任の先生より若い／事務員が／赴任した。”, and the pause positions (P) are set as “担任の先生より若い(P)事務員が(P)赴任した。”.
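The three notations used in this example can be reproduced from a single nested list, as in the Python sketch below. The data layout (a list of continuous phrases, each a list of words) and the function names are assumptions made for illustration; the segmentation itself is copied from the example rather than computed.

```python
# The segmentation from the example: continuous phrases, each a list of words.
PHRASES = [
    ["担任", "の", "先生", "より", "若い"],
    ["事務員", "が"],
    ["赴任", "し", "た", "。"],
]


def word_view(phrases):
    """Show every word boundary with a full-width slash."""
    return "／".join(word for phrase in phrases for word in phrase)


def phrase_view(phrases):
    """Show only the boundaries between continuous phrases."""
    return "／".join("".join(phrase) for phrase in phrases)


def pause_view(phrases, marker="(P)"):
    """Replace each boundary between continuous phrases with a pause marker."""
    return marker.join("".join(phrase) for phrase in phrases)
```

Calling `pause_view(PHRASES)` gives the marked-up sentence with a pause marker at each of the two phrase boundaries and none inside a phrase.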

[0035] As described above, the speech synthesizer 1 of this embodiment sets the pause positions for voice output by dividing the text data into continuous phrases. It can therefore set the pause positions appropriately without complicated data processing such as syntax analysis, and can output text data as natural-sounding speech with simple data processing.

[0036] In particular, the continuous phrase dividing means 23 has the word dictionary 28 and the part-of-speech sequence dictionary 29, and divides the text data into words before collating them with the part-of-speech sequences to divide the text into continuous phrases. The text data can therefore be divided into continuous phrases with existing, simple data processing, and the accuracy and performance of the division can easily be tuned through the contents of the dictionaries 28 and 29. The present invention is not limited to the above embodiment, and various modifications are possible. For example, the above embodiment illustrates the continuous phrase dividing means 23, which has the dictionaries 28 and 29, dividing the text data into words and then into continuous phrases; however, it is also possible to prepare, for example, a dictionary that integrates the dictionaries 28 and 29, and to divide the text data directly into continuous phrases without first dividing it into words.

[0037] It is also possible to set predetermined grammar rules on the formation of continuous phrases in a continuous phrase dividing means that has no dictionaries 28 and 29, have it detect continuous phrase candidates from the text data based on these rules, and have it select the maximum-likelihood solution from the candidates, thereby dividing the text data into continuous phrases. In this case the dictionaries 28 and 29 are not required, so the software can be made smaller, the data-processing load can be reduced and the speed improved, and the text data can be divided into continuous phrases with higher accuracy.

[0038] Furthermore, this embodiment illustrates the various means of the speech synthesizer 1 being realized by the CPU 2 executing data processing in accordance with a program recorded as software in the RAM 5 or the like. However, each of these means may instead be built as dedicated hardware, or some may be recorded as software in the RAM 5 or the like while others are built as hardware. The RAM 5 or the like holding the predetermined software, and the hardware of each part, may also be produced as, for example, firmware.

[0039] This embodiment also illustrates the software being installed from the CD-ROM 9 onto the HDD 6, copied to the RAM 5, and read from the RAM 5 by the CPU 2. However, the information storage medium that supplies the software to the CPU 2 may be any medium the CPU 2 can access. For example, the CPU 2 may use the software directly from the CD-ROM 9 or the like, the software may be fixedly recorded in the ROM 4 in advance, or it may be distributed across a plurality of information storage media.

[0040] The program for realizing the various means of the speech synthesizer 1 may also be realized as a combination of several pieces of software, in which case the information storage medium that forms a stand-alone product need only hold the minimum necessary software. For example, when application software is supplied on an information storage medium such as the CD-ROM 9 to a computer system on which an operating system is installed, the software that realizes the various means of the speech synthesizer 1 is the combination of the application software and the operating system, so the portion that depends on the operating system can be omitted from the information storage medium of the application software.

[0041] Furthermore, although all the software required by the speech synthesizer 1 is stored in the RAM 5 here, it is also possible, for example, to replace part of the software of an existing speech synthesizer (not shown) with the software of the speech synthesizer 1 of this embodiment so that the existing device functions as the speech synthesizer 1. In that case, the information storage medium such as the CD-ROM 9 need only hold the programs for dividing the text data into continuous phrases and for setting pause positions for speech output in the text data based on the break positions between continuous phrases. These programs then replace the corresponding part of the existing speech synthesizer's program.

[0042] The way software recorded on an information storage medium is supplied to a computer is not limited to loading that medium directly into the computer. For example, the software described above may be recorded on an information storage medium of a host computer, the host computer connected to a terminal computer over a communication network, and the software supplied from the host computer to the terminal computer by data communication.

[0043] In this case, the terminal computer may download the software to its own information storage medium and then execute the data processing stand-alone, or it may execute the data processing through real-time data communication with the host computer without downloading the software. In that case, the entire system in which the host computer and the terminal computer are connected by the communication network corresponds to the speech synthesizer 1 of the present invention.

[0044] Next, a second embodiment of the present invention is briefly described below with reference to FIGS. 6 to 8. Parts of this embodiment that are the same as in the first embodiment described above are given the same names and reference numerals, and their detailed description is omitted.

[0045] First, the speech synthesizer 31 of this embodiment has the same hardware as the speech synthesizer 1 described above and differs from it in part of its software. As shown in FIG. 6, most of its means 22 to 27 are therefore the same as in the speech synthesizer 1, but a role determining means 32 is newly provided, and this role determining means 32 includes a continuous phrase function dictionary 33 as a part of it.

[0046] The continuous phrase function dictionary 33 is stored on the HDD 6 as, for example, a software data file, and, as shown in FIG. 7, it holds the grammatical functions of continuous phrases. Through predetermined data processing by the CPU 2 in accordance with the program in the RAM 5, the role determining means 32 matches the continuous phrases detected from the text data by the continuous phrase dividing means 23 against the contents of the continuous phrase function dictionary 33 and determines the grammatical function of each continuous phrase individually. The function of the pause setting means 24 is partly changed accordingly: instead of setting every break position between continuous phrases directly as a pause position, it selects among the break positions based on the functions of the continuous phrases.

[0047] The various means of the speech synthesizer 31 described above are also realized mainly by the CPU 2 operating in accordance with the software recorded in the RAM 5 or the like. The RAM 5 therefore newly holds the continuous phrase function dictionary 33, in software form readable by the CPU 2, and a program for causing the CPU 2 to function as the various means.

[0048] More specifically, the RAM 5 newly holds a program for causing the CPU 2 to match the continuous phrases detected from the text data against the contents of the continuous phrase function dictionary 33, to determine the grammatical function of each continuous phrase individually from this matching, and to set the pause positions by selecting among the break positions between continuous phrases based on the result of the determination.

[0049] With this configuration, the speech synthesizer 31 of this embodiment, like the speech synthesizer 1 described above, can take Japanese natural-language text data as input and output it as speech while breaking it at natural positions. The speech synthesis method of the speech synthesizer 31 is briefly described below with reference to the flowchart of FIG. 8.

[0050] First, the input Japanese natural-language text data is temporarily stored, divided into words, and then divided into continuous phrases. Next, the continuous phrases detected from the text data are matched against the contents of the continuous phrase function dictionary 33, and the grammatical function of each continuous phrase is determined individually from this matching. Based on the result, the break positions between continuous phrases are selected among and the pause positions are set, and the text data is then output as speech with a pause of a predetermined time at each pause position.

[0051] For example, the text data “担任の先生より若い事務員が赴任した。” ("A clerk younger than the homeroom teacher has been assigned.") is divided into the continuous phrases “担任の先生より若い／事務員が／赴任した。”, and the grammatical functions of the continuous phrases are then determined as “担任の先生より若い (adnominal modifier clause)／事務員が (subject clause)／赴任した。(predicate clause)”. In this case every break position is appropriate as a pause position, so the pause positions (P) are set in the text data as “担任の先生より若い(P)事務員が(P)赴任した。(P)”. The last pause position in this example sentence is set at the end of the sentence independently of the continuous phrases.
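The function-based selection of pause positions in this example can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the description does not give the contents of the continuous phrase function dictionary 33 (FIG. 7) or the selection rule, so `FUNCTION_DICT` (here keyed on phrase endings), the set `PAUSE_AFTER`, and the function names are all hypothetical. The pause marker is written as ASCII `(P)`.

```python
# Hypothetical stand-in for the continuous phrase function dictionary 33:
# phrase ending -> grammatical function of the continuous phrase.
FUNCTION_DICT = {
    "若い": "adnominal modifier",
    "が": "subject",
    "した。": "predicate",
}

# Assumed selection rule: a break position is kept as a pause position only
# when the continuous phrase before it has one of these functions.
PAUSE_AFTER = {"adnominal modifier", "subject"}


def determine_function(phrase):
    """Look up the grammatical function of one continuous phrase."""
    for ending, function in FUNCTION_DICT.items():
        if phrase.endswith(ending):
            return function
    return "unknown"


def set_pauses(phrases):
    """Select break positions by function; the sentence end always gets a pause."""
    out = []
    for i, phrase in enumerate(phrases):
        out.append(phrase)
        is_sentence_end = i == len(phrases) - 1
        if is_sentence_end or determine_function(phrase) in PAUSE_AFTER:
            out.append("(P)")
    return "".join(out)
```

For the three continuous phrases of the example sentence, every break is kept, matching the description. A phrase whose function is not in `PAUSE_AFTER` would have its break position discarded, which is the selection behavior that distinguishes this embodiment from the first.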

[0052] As described above, the speech synthesizer 31 of this embodiment does not set the break position of every continuous phrase directly as a pause position for speech output, but selects among the break positions based on the grammatical functions of the continuous phrases. Pause positions can therefore be set more appropriately, and the text data can be output as natural-sounding speech.

[0053] In particular, the role determining means 32 has the continuous phrase function dictionary 33 and determines the function of each continuous phrase of the text data according to its contents. The functions can thus be determined with existing, simple data processing, and the accuracy and performance of the determination can easily be tuned through the contents of the continuous phrase function dictionary 33.

[0054] The present invention is not limited to the above embodiment either and admits various modifications. For example, although the above embodiment illustrates the role determining means 32, with its continuous phrase function dictionary 33, determining the grammatical functions of the continuous phrases of the text data, it is also possible to set predetermined grammar rules on the dependency relations within a continuous phrase in a role determining means that has no continuous phrase function dictionary 33, and have it determine the functions of the continuous phrases based on these rules. In this case the continuous phrase function dictionary 33 is not required, so the software can be made smaller, the data-processing load can be reduced and the speed improved, and the text data can be divided into continuous phrases with higher accuracy.

[0055] Next, a third embodiment of the present invention is briefly described below with reference to FIGS. 9 and 10. Parts of this embodiment that are the same as in the second embodiment described above are given the same names and reference numerals, and their detailed description is omitted.

[0056] First, in the speech synthesizer 41 of this embodiment as well, part of the software in the RAM 5 that the CPU 2 reads and executes is changed, so that, as shown in FIG. 9, an information storage means 42, a relation determining means 43, a time adjusting means 44, and so on are newly provided.

[0057] The information storage means 42 stores various information, such as the break position of each continuous phrase in the text data, the order of the continuous phrases, and the functions of the continuous phrases, in a predetermined work area of the RAM 5. The relation determining means 43 determines the relations between continuous phrases based on this temporarily stored information, and the time adjusting means 44 individually adjusts the pause time at each pause position based on the determined relations.

[0058] More specifically, a time table (not shown) is provided in which one pause time is set for each combination of the break position, the order, and the function of the continuous phrases. The pause time read from this time table is set, for example, as an attribute of the pause position information that the pause setting means 24 sets in the text data. The speech output means 25 is partly changed accordingly and outputs the text data as speech, pausing at each pause position for its pause time.

[0059] Accordingly, the RAM 5 of the speech synthesizer 41 of this embodiment newly holds a program for causing the CPU 2 to store, in a predetermined area of the RAM 5, the break position of each continuous phrase of the text data and the order of the continuous phrases detected by the continuous phrase determining means 27; to store, in a predetermined area of the RAM 5, the functions of the continuous phrases detected by the role determining means 32; to determine the relations between continuous phrases based on the information thus temporarily stored in the RAM 5; to adjust the pause time at each pause position individually based on the determined relations; and to break the speech output of the text data from the speaker 16 at each pause position for its pause time.

[0060] With this configuration, the speech synthesizer 41 of this embodiment, like the speech synthesizer 31 described above, can take Japanese natural-language text data as input and output it as speech while breaking it at natural positions. The speech synthesis method of the speech synthesizer 41 is briefly described below with reference to the flowchart of FIG. 10.

[0061] First, the input Japanese natural-language text data is temporarily stored, divided into words, and then divided into continuous phrases, and the information on the break positions between the continuous phrases and on their order is temporarily stored in the RAM 5. Next, the grammatical function of each continuous phrase detected from the text data is determined individually, the information on these functions is also temporarily stored in the RAM 5, and the pause positions of the text data are set based on the result of the function determination.

[0062] The temporarily stored information on the break positions, order, and functions of the continuous phrases is then read out, and the relations between continuous phrases are determined based on this information. The pause time at each pause position is set individually based on the determined relations, and the text data is output as speech with a pause of that pause time at each pause position.

[0063] For example, the text data “担任の先生より若い事務員が赴任した。” discussed above has the pause positions (P) set as “担任の先生より若い(P)事務員が(P)赴任した。(P)”, and the pause-time attribute information is then set on these pause positions as “担任の先生より若い(P1)事務員が(P2)赴任した。(P5)”. Here the “x” in (Px) indicates the relative length of the pause time, and the sentence-final (P5) is set to the longest pause time regardless of the relations between continuous phrases.
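The assignment of relative pause times in this example can be sketched in Python as follows. This is a simplified sketch, not the patented time table: the description says the table is keyed on combinations of break position, order, and function, whereas the hypothetical `TIME_TABLE` below is keyed only on the functions of the two neighbouring continuous phrases. The table values, the default level, and the function name are assumptions chosen to reproduce the (P1), (P2), (P5) example.

```python
# Hypothetical time table: (function before break, function after break)
# -> relative pause level x, written (Px) in the description.
TIME_TABLE = {
    ("adnominal modifier", "subject"): 1,
    ("subject", "predicate"): 2,
}
SENTENCE_END_LEVEL = 5  # the sentence end gets the longest pause
DEFAULT_LEVEL = 1       # assumed fallback for pairs missing from the table


def annotate_pause_times(phrases):
    """phrases: list of (text, function) pairs in sentence order."""
    out = []
    for i, (text, function) in enumerate(phrases):
        out.append(text)
        if i + 1 < len(phrases):
            next_function = phrases[i + 1][1]
            level = TIME_TABLE.get((function, next_function), DEFAULT_LEVEL)
        else:
            level = SENTENCE_END_LEVEL
        out.append(f"(P{level})")
    return "".join(out)
```

A speech output stage would then map each relative level to an actual duration; that mapping is not specified in the description and is left out here.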

[0064] As described above, the speech synthesizer 41 of this embodiment individually adjusts the pause time at each pause position based on the various information on the continuous phrases of the text data, so the text data can be output as more natural-sounding speech.

[0065] In particular, the relation determining means 43 determines the relations between continuous phrases based on the various information on the continuous phrases detected from the text data, and the time adjusting means 44 individually adjusts the pause time at each pause position based on these relations, so each pause time is set appropriately for the relations among the continuous phrases of the text data. That is, when the text data consists of continuous phrases A, B, C, and so on, the pause time at the “B/C” boundary can be set based on both the “A-B” relation and the “A-C” relation, so pause times can be set appropriately based on the varied relations among the many continuous phrases detected from the text data.

[0066] The present invention is not limited to the above embodiment either and admits various modifications. For example, although the above embodiment illustrates the time adjusting means 44 adjusting the pause time based only on the relations between the continuous phrases of the text data, it is also possible, as shown in FIG. 11, to provide the time adjusting means with a connection likelihood dictionary 45 in which the connection likelihoods of the functions of adjacent continuous phrases are set, and have the time adjusting means also take the connection likelihood into account when adjusting each pause time. Because the pause time is adjusted with the connection likelihood of the adjacent functions taken into account, the pause time between adjacent continuous phrases can be set with simple data processing, and its accuracy and performance can easily be tuned through the contents of the connection likelihood dictionary 45.
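One way the connection likelihood could enter the pause-time adjustment is sketched below in Python. This is speculative: the description does not give the contents of the connection likelihood dictionary 45 (FIG. 11) or say how a likelihood changes a pause time. The sketch assumes likelihoods in the range 0 to 1 and assumes that a weakly connected pair of functions lengthens the pause by one relative level; the dictionary values, the threshold, and the function name are all hypothetical.

```python
# Hypothetical stand-in for the connection likelihood dictionary 45:
# (function of left phrase, function of right phrase) -> likelihood in [0, 1].
CONNECTION_LIKELIHOOD = {
    ("adnominal modifier", "subject"): 0.9,
    ("subject", "predicate"): 0.4,
}


def adjust_pause_level(base_level, left_function, right_function, threshold=0.5):
    """Lengthen the pause by one level when the two functions connect weakly.

    Assumption: pairs missing from the dictionary are treated as neutral
    (likelihood equal to the threshold), so their pause level is unchanged.
    """
    likelihood = CONNECTION_LIKELIHOOD.get((left_function, right_function), threshold)
    if likelihood < threshold:
        return base_level + 1
    return base_level
```

Under these assumptions a strongly connected pair keeps its base level, while a weakly connected pair such as ("subject", "predicate") moves from level 1 to level 2. Editing the dictionary values is the tuning mechanism the description mentions.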

[0067] The above embodiment also illustrates setting the pause positions based on the positions between continuous phrases. However, it is also possible, for example, to provide a phrase dividing means (not shown) that divides the text data into phrases (bunsetsu), which are predetermined word groups, and have the pause setting means 24 also set pause positions at the boundaries between phrases within a continuous phrase. The time adjusting means 44 can further be made to set the pause times so that the pause time between continuous phrases is longer than the pause time between such phrases. In this case the speech output of the text data pauses longer between continuous phrases and more briefly between the phrases within a continuous phrase, so the text data is output as more natural-sounding speech.

[0068] For example, when the pause positions and pause times between continuous phrases have been set in the text data “担任の先生より若い事務員が赴任した。” discussed above as “担任の先生より若い(P1)事務員が(P2)赴任した。(P5)”, the pause positions and pause times between phrases are added to give “担任の(P1)先生より(P1)若い(P2)事務員が(P3)赴任した。(P5)”. That is, because a pause time of (P1) is added between the phrases within a continuous phrase, the pause time (P1) between continuous phrases is relatively increased to (P2), and (P2) is increased to (P3).
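The combination of phrase-level and continuous-phrase-level pauses in this example can be sketched in Python as follows. This is an illustrative sketch only: the nested-list input format, the function name, and the rule of raising each inter-continuous-phrase level by the intra-phrase level are assumptions that happen to reproduce the (P1)/(P2)/(P3)/(P5) example; the description states only that the pause between continuous phrases must be longer than the pause between phrases.

```python
def add_phrase_pauses(continuous_phrases, inter_levels, end_level=5, intra_level=1):
    """Insert short pauses between phrases and raise the longer pauses.

    continuous_phrases: list of continuous phrases, each a list of phrases.
    inter_levels: relative pause level at each break between continuous
        phrases, as set before phrase-level pauses were added.
    """
    if len(inter_levels) != len(continuous_phrases) - 1:
        raise ValueError("need one level per break between continuous phrases")
    parts = []
    for i, phrases in enumerate(continuous_phrases):
        # Short pause between the phrases inside one continuous phrase.
        parts.append(f"(P{intra_level})".join(phrases))
        if i < len(inter_levels):
            # Assumed rule: raise the inter-phrase level so it stays longer.
            parts.append(f"(P{inter_levels[i] + intra_level})")
    parts.append(f"(P{end_level})")  # the sentence-final pause is unchanged
    return "".join(parts)
```

Calling it with `[["担任の", "先生より", "若い"], ["事務員が"], ["赴任した。"]]` and inter-phrase levels `[1, 2]` yields the annotated sentence given in the description.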

【００６９】[0069]

[Effects of the Invention] The speech synthesizer of the invention according to claim 1 has a data input device that accepts input of various data, a speech output device that outputs various data as speech, a data storage device that temporarily stores various data, data input means for accepting Japanese natural-language text data input to the data input device, data storage means for temporarily storing the input text data in the data storage device, continuous phrase dividing means for dividing the temporarily stored text data into continuous phrases, pause setting means for setting pause positions for speech output in the text data based on the break positions between continuous phrases, and speech output means for causing the speech output device to output the text data as speech with a pause of a predetermined time at each pause position. Because the pause positions for speech output of the text data are set based on the break positions between continuous phrases, the pause positions can be set properly with simple data processing, and the text data can be output as speech while being broken at natural positions.

[0070] In the speech synthesizer of the invention according to claim 2, a word dictionary is provided in which the words that form text data are set together with their part-of-speech information, and a part-of-speech sequence dictionary is provided in which the part-of-speech sequences that form continuous phrases are set. The continuous phrase dividing means divides the text data into words according to the contents of the word dictionary, determines the part of speech of each word, and matches the parts of speech of the text data against the contents of the part-of-speech sequence dictionary to divide the text data into continuous phrases. Because the text data is divided into continuous phrases based on the contents of the word dictionary and the part-of-speech sequence dictionary, this division can be executed simply, and its accuracy and performance can be tuned through the contents of the two dictionaries.

[0071] In the speech synthesizer of the invention according to claim 3, the continuous phrase dividing means detects continuous phrase candidates from the text data based on predetermined grammar rules on the formation of continuous phrases and selects the maximum-likelihood solution from the candidates to determine the continuous phrases. Because the text data can be divided into continuous phrases based on the predetermined grammar rules, it can be divided appropriately without requiring a large-scale dictionary.

[0072] In the speech synthesizer of the invention according to claim 4, role determining means is provided for individually determining the grammatical function of each continuous phrase detected from the text data, and the pause setting means sets the pause positions taking the determined functions into account as well. Because the pause positions are set in the text data in accordance with the grammatical functions of the continuous phrases, the pause positions can be set more appropriately.

[0073] In the speech synthesizer of the invention according to claim 5, a continuous phrase function dictionary is provided in which the grammatical functions of continuous phrases are set, and the role determining means determines the functions of the continuous phrases detected from the text data according to the contents of this dictionary. Because the grammatical functions of the continuous phrases are determined according to the contents of the continuous phrase function dictionary, the function determination can be executed simply, and its accuracy and performance can be tuned through the contents of the dictionary.

[0074] In the speech synthesizer of the invention according to claim 6, the role determining means determines the functions of the continuous phrases based on predetermined grammar rules on the dependency relations within a continuous phrase. Because the grammatical functions of the continuous phrases are determined based on the predetermined grammar rules, the functions can be determined appropriately without requiring a large-scale dictionary.

[0075] In the speech synthesizer of the invention according to claim 7, information storage means is provided for temporarily storing in the data storage device various information on the continuous phrases, including at least their break positions in the text data, their order, and their functions; relation determining means is provided for determining the relations between continuous phrases based on the temporarily stored information; and time adjusting means is provided for individually adjusting, based on the determined relations, the pause time for which the speech output is broken at each pause position. Because the pause times of the plurality of pause positions are adjusted individually based on the relations between the continuous phrases detected from the text data, the text data can be output as more natural-sounding speech.

[0076] In the speech synthesizer of the invention according to claim 8, a connection likelihood dictionary in which the connection likelihoods between the functions of adjacent continuous phrases are set is provided, and the time adjusting means individually adjusts the pause times in accordance with the connection likelihoods as well. Because the pause time between adjacent continuous phrases is adjusted based on the setting contents of the connection likelihood dictionary, the process of adjusting the pause time can be executed easily, and its accuracy and performance can also be adjusted through the setting contents of the connection likelihood dictionary.
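One way to picture the adjustment described above is a table keyed by the functions of two adjacent continuous phrases, with the pause shortened as the connection likelihood rises. The following is a minimal sketch under stated assumptions: the likelihood values, the function labels, the linear scaling, and the names CONNECTION_LIKELIHOOD and pause_time_ms are all hypothetical and do not come from the patent.

```python
# Hypothetical connection likelihood dictionary: (function of left phrase,
# function of right phrase) -> likelihood between 0 and 1. All entries are
# illustrative assumptions.
CONNECTION_LIKELIHOOD = {
    ("topic", "object"): 0.3,
    ("object", "predicate"): 0.9,
    ("modifier", "subject"): 0.8,
}


def pause_time_ms(left, right, base_ms=400, min_ms=100,
                  table=CONNECTION_LIKELIHOOD, default=0.5):
    """Return a pause time that shrinks as the connection likelihood grows.

    The linear scaling is an assumption of this sketch; the source only
    states that the pause time is adjusted according to the likelihood.
    """
    likelihood = table.get((left, right), default)
    return max(min_ms, round(base_ms * (1.0 - likelihood)))
```

With this arrangement, tuning the pauses means editing the table entries rather than the adjustment logic.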

[0077] In the speech synthesizer according to claim 9, phrase dividing means for dividing the text data into predetermined individual phrases is provided; the pause setting means also sets pause positions at the division positions between the individual phrases within a continuous phrase; and the time adjusting means sets the pause time between continuous phrases longer than the pause time between individual phrases. Because a long pause time is set between the continuous phrases of the text data and a short pause time is set between the individual phrases within a continuous phrase, the text data can be output as voice more naturally.
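The two levels of pause described above can be sketched as follows. This is an illustrative sketch only: the input is assumed to be already divided into continuous phrases and their constituent phrases, the millisecond values are arbitrary, and insert_pauses is a hypothetical name.

```python
def insert_pauses(continuous_phrases, inner_pause_ms=50, outer_pause_ms=300):
    """Interleave speech units with short and long pauses.

    continuous_phrases is a list of continuous phrases, each given as a list
    of its constituent phrases. A short pause separates phrases inside a
    continuous phrase and a longer pause separates continuous phrases.
    """
    if outer_pause_ms <= inner_pause_ms:
        raise ValueError("pause between continuous phrases must be the longer one")
    events = []
    for i, phrases in enumerate(continuous_phrases):
        if i > 0:
            events.append(("pause", outer_pause_ms))
        for j, phrase in enumerate(phrases):
            if j > 0:
                events.append(("pause", inner_pause_ms))
            events.append(("speak", phrase))
    return events
```

For example, insert_pauses([["私の", "本を"], ["読んだ"]]) places a 50 ms pause inside the first continuous phrase and a 300 ms pause before the second.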

[0078] In the speech synthesis method of the invention according to claim 10, input of Japanese natural-language text data is accepted, the input text data is temporarily stored, the temporarily stored text data is divided into continuous phrases, pause positions for voice output are set in the text data based on the break positions between the continuous phrases, and the text data is output as voice while pausing for a predetermined time at each pause position. Because the pause positions for voice output are set at the break positions between the continuous phrases of the text data, the pause positions can be set appropriately by simple data processing, and the text data can be output as voice while being separated at natural positions.
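The steps of the method can be sketched end to end as below. This is a minimal sketch, not the patented implementation: the division into continuous phrases is replaced by a stand-in that assumes the boundaries are already marked with "/", the voice output is represented by a list of events rather than real audio, and the names toy_divider and synthesize_with_pauses are hypothetical.

```python
def toy_divider(text):
    """Stand-in divider: assumes continuous phrase boundaries are marked with '/'."""
    return [part for part in text.split("/") if part]


def synthesize_with_pauses(text, divide=toy_divider, pause_ms=300):
    """Return (pause_positions, events) for the given text.

    pause_positions are character offsets in the text with the boundary marks
    removed; events is the sequence that a voice output device would receive.
    """
    stored = str(text)            # temporarily store the input text data
    phrases = divide(stored)      # divide into continuous phrases

    # Set a pause position at every break between continuous phrases.
    pause_positions = []
    offset = 0
    for phrase in phrases[:-1]:
        offset += len(phrase)
        pause_positions.append(offset)

    # Output while pausing for a predetermined time at each pause position.
    events = []
    for i, phrase in enumerate(phrases):
        events.append(("speak", phrase))
        if i < len(phrases) - 1:
            events.append(("pause", pause_ms))
    return pause_positions, events
```

Passing the divider as a parameter keeps the sketch independent of how the continuous phrases are actually detected, for example by dictionary collation or by grammatical rules.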

[0079] On the information storage medium of the invention according to claim 11, a program is recorded for causing a computer to execute: accepting Japanese natural-language text data input to a data input device; temporarily storing the input text data in a data storage device; dividing the temporarily stored text data into continuous phrases; setting pause positions for voice output in the text data based on the break positions between the continuous phrases; and causing a voice output device to output the text data as voice while pausing for a predetermined time at each pause position. When a computer to which a data input device, a data storage device, and a voice output device are connected reads this program and executes the corresponding operations, the pause positions of the text data are set based on the break positions between the continuous phrases. The pause positions can therefore be set appropriately by simple data processing, and the text data can be output as voice while being separated at natural positions.

[0080] On the information storage medium of the invention according to claim 12, a program is recorded for causing a computer to execute: dividing text data into continuous phrases; and setting pause positions for voice output in the text data based on the break positions between the continuous phrases. When the computer of a speech synthesizer equipped with a data input device, a data storage device, and a voice output device reads this program and executes the corresponding operations, the pause positions of the text data are set based on the break positions between the continuous phrases. The pause positions can therefore be set appropriately by simple data processing, and the text data can be output as voice while being separated at natural positions.

[Brief description of the drawings]

FIG. 1 is a schematic diagram showing the logical structure of a speech synthesizer according to a first embodiment of the present invention.

FIG. 2 is a block diagram showing the physical structure of the speech synthesizer.

FIG. 3 is a perspective view showing the external appearance of the speech synthesizer.

FIG. 4 is a schematic diagram showing the setting contents of a part-of-speech sequence dictionary.

FIG. 5 is a flowchart showing the speech synthesis method of the speech synthesizer.

FIG. 6 is a schematic diagram showing the logical structure of a speech synthesizer according to a second embodiment of the present invention.

FIG. 7 is a schematic diagram showing the setting contents of a continuous phrase function dictionary.

FIG. 8 is a flowchart showing a speech synthesis method.

FIG. 9 is a schematic diagram showing the logical structure of a speech synthesizer according to a third embodiment of the present invention.

FIG. 10 is a flowchart showing a speech synthesis method.

FIG. 11 is a schematic diagram showing the setting contents of a connection likelihood dictionary.

[Explanation of symbols]

1, 31, 41 Speech synthesizer
2 Computer
4 to 7, 9 Information storage medium
5 to 7 Data storage device
8, 10 to 12, 14, 15 Data input device
16 Voice output device
21 Data input means
22 Data storage means
23 Continuous phrase dividing means
24 Pause setting means
25 Voice output means
26 Word dividing means
27 Continuous phrase determining means
28 Word dictionary
29 Part-of-speech sequence dictionary
32 Role determining means
33 Continuous phrase function dictionary
42 Information storage means
43 Relationship determining means
44 Time adjusting means
45 Connection likelihood dictionary

Claims

[Claims]

1. A speech synthesizer comprising: a data input device for receiving input of various data; a voice output device for outputting various data as voice; a data storage device for temporarily storing various data; data input means for accepting Japanese natural-language text data input to the data input device; data storage means for temporarily storing the input text data in the data storage device; continuous phrase dividing means for dividing the temporarily stored text data into continuous phrases; pause setting means for setting pause positions for voice output in the text data based on the break positions between the continuous phrases; and voice output means for causing the voice output device to output the text data as voice while pausing for a predetermined time at each pause position.

2. The speech synthesizer according to claim 1, further comprising a word dictionary in which the words forming text data are set together with part-of-speech information, and a part-of-speech sequence dictionary in which the part-of-speech sequences forming continuous phrases are set, wherein the continuous phrase dividing means divides the text data into words in accordance with the setting contents of the word dictionary, determines the part of speech of each word, and collates the part-of-speech sequence of the text data with the setting contents of the part-of-speech sequence dictionary to divide the text data into continuous phrases.

3. The speech synthesizer according to claim 1, wherein the continuous phrase dividing means detects continuous phrase candidates from the text data based on predetermined grammatical rules regarding the formation of continuous phrases, and selects the maximum likelihood solution from the candidates to determine the continuous phrases.

4. The speech synthesizer according to claim 1, further comprising role determining means for individually determining the grammatical function of each continuous phrase detected from the text data, wherein the pause setting means sets the pause positions in accordance with the determined functions of the continuous phrases as well.

5. The speech synthesizer according to claim 4, further comprising a continuous phrase function dictionary in which the grammatical functions of continuous phrases are set, wherein the role determining means determines the function of each continuous phrase detected from the text data in accordance with the setting contents of the continuous phrase function dictionary.

6. The speech synthesizing apparatus according to claim 4, wherein the role determining means determines the function of the continuous phrase based on a predetermined grammatical rule regarding a dependency relationship in the continuous phrase.

7. The speech synthesizer as described, further comprising: information storage means for temporarily storing, in the data storage device, various information on the continuous phrases, including at least their break positions in the text data, their order, and their functions; relationship determining means for determining the relationships between the continuous phrases based on the temporarily stored information; and time adjusting means for individually adjusting, based on the determined relationships, the pause time for which voice output is interrupted at each pause position.

8. The speech synthesizer according to claim 7, further comprising a connection likelihood dictionary in which the connection likelihoods between the functions of adjacent continuous phrases are set, wherein the time adjusting means individually adjusts the pause times in accordance with the connection likelihoods as well.

9. The speech synthesizer according to claim 7, further comprising phrase dividing means for dividing the text data into predetermined individual phrases, wherein the pause setting means also sets pause positions at the division positions between the individual phrases within a continuous phrase, and the time adjusting means sets the pause time between continuous phrases longer than the pause time between individual phrases.

10. A speech synthesis method comprising: accepting input of Japanese natural-language text data; temporarily storing the input text data; dividing the temporarily stored text data into continuous phrases; setting pause positions for voice output in the text data based on the break positions between the continuous phrases; and outputting the text data as voice while pausing for a predetermined time at each pause position.

11. An information storage medium on which is recorded a program for causing a computer to execute: accepting Japanese natural-language text data input to a data input device; temporarily storing the input text data in a data storage device; dividing the temporarily stored text data into continuous phrases; setting pause positions for voice output in the text data based on the break positions between the continuous phrases; and causing a voice output device to output the text data as voice while pausing for a predetermined time at each pause position.

12. An information storage medium on which is recorded a program for causing a computer to execute: dividing text data into continuous phrases; and setting pause positions for voice output in the text data based on the break positions between the continuous phrases.