JPH0836395A

JPH0836395A - Generating method for voice data and document reading device

Info

Publication number: JPH0836395A
Application number: JP6204911A
Authority: JP
Inventors: Isamu Iwai; 勇岩井; Kenichiro Kobayashi; 賢一郎小林
Original assignee: Toshiba Corp; Toshiba AVE Co Ltd
Current assignee: Toshiba Corp; Toshiba AVE Co Ltd
Priority date: 1994-05-20
Filing date: 1994-08-30
Publication date: 1996-02-06

Abstract

PURPOSE: To read a document aloud in a natural manner that always fits the context and scene of the document. CONSTITUTION: An analysis result buffer 7 stores the results of the Japanese-language analysis that a Japanese analysis section 4 performs on a specified document in a document data file 1. A voice data generating section 8 generates voice data from these analysis results. During generation, a numeric character string located before or after a special character detected by a special character detection section 17 is assigned a special reading in accordance with the corresponding rule in a special character processing rule table 21. Likewise, when a special pattern detecting section 19 detects a numeric character string of a special pattern, that string is assigned a special reading in accordance with the corresponding rule in a special pattern processing rule table 22. The generated voice data are stored in a voice data file 10. A voice synthesizer 11 converts the voice data into voice signals, which a voice output section 13 outputs. Numeric character strings in the document, such as telephone numbers and postal codes, are thus read aloud appropriately.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a document reading device that reads a document aloud by performing Japanese-language analysis on the document to be read, generating voice data, synthesizing voice from the voice data, and outputting the voice externally. More particularly, it relates to a method of generating the voice data.

[0002]

[Prior Art] In a conventional voice reading device, a string of digits found in the analysis result obtained by Japanese-language analysis of the document data to be read was converted to voice data using either a digit-by-digit reading or a place-value reading, depending on a preset mode. As a concrete example, when the analysis result contained "０３（１２３）４５６７", the voice data generation unit would normally generate the reading 「ぜろさ＾ん／か＾っこ／ひゃくに＾じゅうさん／か＾っこ／よんせ＾ん／ごひゃくろくじゅうな＾な」 (zerosa^n / ka^kko / hyakuni^juusan / ka^kko / yonse^n / gohyakurokujuuna^na; that is, "zero three, parenthesis, one hundred twenty-three, parenthesis, four thousand five hundred sixty-seven"). It never generated the customary reading for a telephone number, 「ぜろさ＾んの／いちに＾いさんの／よんご−ろくな＾な」 (zerosa^nno / ichini^isanno / yongo-rokuna^na).
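The two preset modes described above can be illustrated with a minimal Python sketch. The function names, the romanized digit table, and the four-digit limit are assumptions made for illustration; the patent does not specify an implementation, and accent marks are omitted.

```python
# Romanized digit names (illustrative table, not taken from the patent).
DIGITS = ["zero", "ichi", "ni", "san", "yon", "go", "roku", "nana", "hachi", "kyuu"]

# Place-value units for a group of up to four digits, most significant first.
UNITS = ["sen", "hyaku", "juu", ""]

# Common sound changes when a digit combines with a unit.
IRREGULAR = {
    (3, "hyaku"): "sanbyaku", (6, "hyaku"): "roppyaku", (8, "hyaku"): "happyaku",
    (3, "sen"): "sanzen", (8, "sen"): "hassen",
}


def digit_reading(s):
    """Digit-by-digit mode: read every digit on its own."""
    return [DIGITS[int(c)] for c in s]


def place_value_reading(s):
    """Place-value mode for a string of one to four digits."""
    if not (s.isdecimal() and 1 <= len(s) <= 4):
        raise ValueError("expected 1 to 4 decimal digits")
    if int(s) == 0:
        return ["zero"]
    parts = []
    for ch, unit in zip(s.zfill(4), UNITS):
        d = int(ch)
        if d == 0:
            continue                      # empty places are not voiced
        if (d, unit) in IRREGULAR:
            parts.append(IRREGULAR[(d, unit)])
        elif unit and d == 1:
            parts.append(unit)            # "hyaku", not "ichihyaku"
        else:
            parts.append(DIGITS[d] + unit)
    return parts


print(digit_reading("03"))           # ['zero', 'san']
print(place_value_reading("123"))    # ['hyaku', 'nijuu', 'san']
print(place_value_reading("4567"))   # ['yonsen', 'gohyaku', 'rokujuu', 'nana']
```

Neither mode yields the customary telephone-number reading, which is the shortcoming the invention addresses.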

[0003] Similarly, when the analysis result contained a numeric character string that includes a symbol calling for a special reading and that can be read in several ways, such as "１０：１３", the voice data generation unit ought to choose, based on the words appearing before and after it, between 「じゅった＾い／じゅ＾うさん」 (jutta^i / ju^usan, "ten to thirteen" as a ratio) and 「じゅ＾うじ／＾じゅうさ＾んぷん」 (ju^uji / ^juusa^npun, "thirteen minutes past ten" as a time). In practice it could produce only whichever reading had been preset. Consequently, when a conventional document reading device read out document data containing examples like those above, it did not necessarily use a natural reading suited to the context or scene, which made the listener uncomfortable and in some cases caused the listener to misunderstand the meaning. Note that in this description, strings that contain special symbols such as parentheses or "：" among the digits are also referred to collectively as numeric character strings.

[0004]

[Problem to Be Solved by the Invention] With the conventional voice reading device described above, when the Japanese-language analysis result of the document data to be read is a numeric character string representing a telephone number or postal code, or a numeric character string containing a special symbol that denotes a time or a ratio, only the preset reading is used. The reading is therefore sometimes unnatural for the context or scene, which makes the listener uncomfortable and in some cases causes the listener to misunderstand the meaning.

[0005] The present invention eliminates these drawbacks. Its first object is to provide a voice data generation method that, when the Japanese-language analysis result of the document data to be read contains a numeric character string customarily read in a special way because of a special word appearing in that result, detects the string and generates voice data with the special reading, and that, when the analysis result contains a numeric character string of a special pattern customarily read in a special way, likewise detects the string and generates voice data with the special reading. Its second object is to provide a document reading device that synthesizes a voice signal from the voice data generated by this method and outputs it externally, so that documents are always read aloud naturally in a manner suited to the context and scene.

[0006]

[Means for Solving the Problem] The invention of claim 1 is a voice data generation method in which Japanese-language analysis is applied to document data in a computer-processable form and voice data are generated from the analysis result according to voice data generation rules. When a predetermined special word is detected in the analysis result, a reading that follows the rule corresponding to the detected special word is assigned to a numeric character string located close before or after that word, and voice data are thereby generated from the numeric character string.
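As a rough illustration of this special-word mechanism, the following Python sketch scans a token list for special words and assigns a rule-based reading to nearby numeric strings of the form "A:B". The rule table, the `window` parameter, and the use of 「時刻」 (time) as a special word are assumptions for illustration; 「比率」 (ratio) appears in the embodiment's special character table. The output is a list of morphemes to be voiced, not a phonetic transcription.

```python
import re

# Hypothetical rule table: special word -> rule for reading "A:B".
SPECIAL_WORD_RULES = {
    "比率": lambda a, b: [a, "対", b],         # ratio: "A to B"
    "時刻": lambda a, b: [a, "時", b, "分"],    # time: "A o'clock, B minutes"
}

# Numeric string with a half-width or full-width colon.
NUMERIC = re.compile(r"(\d+)[:：](\d+)")


def apply_special_word_rules(tokens, window=2):
    """Map the index of each numeric token near a special word to its reading.

    `window` is an assumed notion of "close before or after": the number
    of tokens on each side of the special word that are inspected.
    """
    readings = {}
    for i, tok in enumerate(tokens):
        rule = SPECIAL_WORD_RULES.get(tok)
        if rule is None:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            m = NUMERIC.fullmatch(tokens[j])
            if m:
                readings[j] = rule(m.group(1), m.group(2))
    return readings


print(apply_special_word_rules(["比率", "は", "10:13"]))
# {2: ['10', '対', '13']}
print(apply_special_word_rules(["時刻", "は", "10:13"]))
# {2: ['10', '時', '13', '分']}
```

The same numeric string receives a different reading depending on which special word is nearby, which is the behavior the claim describes.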

[0007] The invention of claim 2 is a voice data generation method in which Japanese-language analysis is applied to document data in a computer-processable form and voice data are generated from the analysis result according to voice data generation rules. When a numeric character string of a predetermined special pattern is detected in the analysis result, a reading that follows the rule corresponding to that pattern is assigned to the numeric character string, and voice data are thereby generated from it.
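A minimal Python sketch of special-pattern detection follows. It assumes the pattern table is a list of regular expressions paired with reading rules; the patent does not say how patterns are represented, so the regex, the names, and the romanized output are all illustrative. Prosodic details of the customary reading, such as accent marks and the lengthened "nii", are omitted.

```python
import re

DIGITS = ["zero", "ichi", "ni", "san", "yon", "go", "roku", "nana", "hachi", "kyuu"]


def read_phone(match):
    """Read each digit group digit by digit and join the groups with "no"."""
    groups = [" ".join(DIGITS[int(c)] for c in g) for g in match.groups()]
    return " no ".join(groups)


# Hypothetical special pattern table: (name, pattern, reading rule).
# The phone pattern accepts half-width or full-width parentheses.
SPECIAL_PATTERNS = [
    ("phone", re.compile(r"(0\d{1,3})[(（](\d{1,4})[)）](\d{4})"), read_phone),
]


def read_special_pattern(s):
    """Return (pattern name, reading) for the first matching pattern, else None."""
    for name, pattern, rule in SPECIAL_PATTERNS:
        m = pattern.fullmatch(s)
        if m:
            return name, rule(m)
    return None


print(read_special_pattern("03(123)4567"))
# ('phone', 'zero san no ichi ni san no yon go roku nana')
print(read_special_pattern("10:13"))
# None
```

Unlike special-word detection, this path needs no surrounding context: the shape of the string alone selects the rule.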

[0008] The invention of claim 3 is a voice data generation method in which Japanese-language analysis is applied to document data in a computer-processable form and voice data are generated from the analysis result according to voice data generation rules. When a predetermined special word is detected in the analysis result, a reading that follows the rule corresponding to the detected special word is assigned to a numeric character string located close before or after that word, and voice data are generated from the numeric character string. Alternatively, when a numeric character string of a predetermined special pattern is detected in the analysis result, a reading that follows the rule corresponding to that pattern is assigned to the numeric character string, and voice data are generated from it.

[0009] The invention of claim 4 is a method in which, when detection of the special word and detection of a special-pattern numeric character string occur simultaneously for the same numeric character string, the voice data generation process triggered by the detection with the higher preset priority is executed to generate voice data from that numeric character string.

[0010] The invention of claim 5 is a method that gives priority to the voice data processing triggered by detection of the special word.
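The priority rule of claims 4 and 5 can be sketched in Python as below. The function name, the `prefer` parameter, and its string values are hypothetical; the sketch only assumes that each detector yields either a reading or `None` for a given numeric character string.

```python
def choose_reading(word_reading, pattern_reading, prefer="special_word"):
    """Resolve a conflict between the two detection paths.

    `word_reading` and `pattern_reading` are the readings proposed for the
    same numeric character string by special-word detection and
    special-pattern detection; either may be None if that detector did not
    fire. `prefer` is the preset priority. The default gives priority to
    the special word, as in claim 5.
    """
    if prefer not in ("special_word", "special_pattern"):
        raise ValueError("prefer must be 'special_word' or 'special_pattern'")
    if word_reading is not None and pattern_reading is not None:
        # Both detectors fired on the same string: apply the preset priority.
        return word_reading if prefer == "special_word" else pattern_reading
    # At most one detector fired: use whichever reading exists.
    return word_reading if word_reading is not None else pattern_reading


print(choose_reading("ratio reading", "time reading"))
# ratio reading
print(choose_reading("ratio reading", "time reading", prefer="special_pattern"))
# time reading
print(choose_reading(None, "phone reading"))
# phone reading
```

Keeping the priority as a setting rather than hard-coding it mirrors claim 4, where the priority is preset and may favor either path.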

[0011] The invention of claim 6 is a document reading device that generates voice data according to voice data generation rules from the analysis result obtained by Japanese-language analysis of the document data to be read, converts the voice data into an electrical voice signal with a voice synthesizer, and outputs the resulting signal externally as voice through a voice output device. The device comprises: special word detection means for detecting a predetermined special word in the analysis result; extraction means for extracting a numeric character string located close before or after the special word detected by the special word detection means; and voice data generation means for generating voice data from the numeric character string by assigning to the string extracted by the extraction means a reading that follows the rule predetermined for the special word detected by the detection means.

[0012] In the invention of claim 7, the special word detection means has table data listing a plurality of special words, and searches for the special word by matching the words that appear in the Japanese-language analysis result against the table data.

[0013] In the invention of claim 8, the voice data generation means has table data listing the rules determined for each of the plurality of special words, and generates voice data from the numeric character string by assigning to the string extracted by the extraction means a reading that follows the rule in the table data corresponding to the special word detected by the detection means.

[0014] The invention of claim 9 is a document reading device that generates voice data according to voice data generation rules from the analysis result obtained by Japanese-language analysis of the document data to be read, converts the voice data into an electrical voice signal with a voice synthesizer, and outputs the resulting signal externally as voice through a voice output device. The device comprises: special pattern detection means for detecting a numeric character string of a predetermined special pattern in the analysis result; and voice data generation means for generating voice data from the numeric character string by assigning to the special-pattern string detected by the special pattern detection means a reading that follows the rule predetermined for that special pattern.

[0015] In the invention of claim 10, the special pattern detection means has table data listing a plurality of special-pattern numeric character strings, and searches for the special word by matching the numeric character strings that appear in the Japanese-language analysis result against the table data.

[0016] In the invention of claim 11, the voice data generation means has table data listing the rules determined for each of the plurality of special-pattern numeric character strings, and generates voice data from the numeric character string by assigning to the special-pattern string detected by the detection means a reading that follows the rule in the table data corresponding to that special pattern.

[0017] The invention of claim 12 is a document reading device that generates voice data according to voice data generation rules from the analysis result obtained by Japanese-language analysis of the document data to be read, converts the voice data into an electrical voice signal with a voice synthesizer, and outputs the resulting signal externally as voice through a voice output device. The device comprises: special word detection means for detecting a predetermined special word in the analysis result; extraction means for extracting a numeric character string located close before or after the special word detected by the special word detection means; first voice data generation means for generating voice data from the numeric character string by assigning to the string extracted by the extraction means a reading that follows the rule predetermined for the special word detected by the special word detection means; special pattern detection means for detecting a numeric character string of a predetermined special pattern in the analysis result; and second voice data generation means for generating voice data from the numeric character string by assigning to the string detected by the special pattern detection means a reading that follows the rule predetermined for the special pattern that the string exhibits.

[0018] The invention of claim 13 comprises: setting means for setting which is to be given priority, the voice data generation process triggered when the special word detection means detects a special word or the voice data generation process triggered when the special pattern detection means detects a special-pattern numeric character string; determination means for determining whether the numeric character string extracted by the extraction means matches the numeric character string detected by the special pattern detection means; and control means for executing, when the determination means determines that the two numeric character strings match, the voice data generation process indicated by the priority information set in the setting means.

[0019]

[Operation] In the voice data generation method of claim 1, if a predetermined special word is present in the Japanese-language analysis result while voice data are being generated from that result, the word is detected, and a reading that follows the rule corresponding to the detected special word is assigned to a numeric character string located close before or after it.

[0020] In the voice data generation method of claim 2, if a numeric character string of a predetermined special pattern is present in the Japanese-language analysis result while voice data are being generated from that result, the string is detected, and a reading that follows the rule corresponding to the detected special-pattern string is assigned to it.

[0021] In the voice data generation method of claim 3, if a predetermined special word is present in the Japanese-language analysis result while voice data are being generated from that result, the word is detected, and a reading that follows the rule corresponding to the detected special word is assigned to a numeric character string located close before or after it. Voice data are thereby created from the numeric character string with a reading that suits the scene or context delimited by the special word. Alternatively, if a numeric character string of a predetermined special pattern is present in the analysis result, the string is detected, and a reading that follows the rule corresponding to the detected special-pattern string is assigned to it.

[0022] In the voice data generation method of claim 4, when detection of the special word and detection of a special-pattern numeric character string occur simultaneously for the same numeric character string, the voice data generation process triggered by the detection with the higher preset priority is executed to generate voice data from that numeric character string.

[0023] In the voice data generation method of claim 5, priority is given to the voice data processing triggered by detection of the special word.

[0024] In the document reading device of claim 6, the special word detection means detects a predetermined special word in the Japanese-language analysis result. The extraction means extracts a numeric character string located close before or after the special word detected by the special word detection means. The voice data generation means generates voice data from the numeric character string by assigning to the string extracted by the extraction means a reading that follows the rule predetermined for the special word detected by the detection means.

[0025] In the document reading device of claim 7, the special word detection means has table data listing a plurality of special words, and searches for the special word by matching the words that appear in the Japanese-language analysis result against the table data.

[0026] In the document reading device of claim 8, the voice data generation means has table data listing the rules determined for each of the plurality of special words, and generates voice data from the numeric character string by assigning to the string extracted by the extraction means a reading that follows the rule in the table data corresponding to the special word detected by the detection means.

[0027] In the document reading device of claim 9, the special pattern detection means detects a numeric character string of a predetermined special pattern in the Japanese-language analysis result. The voice data generation means generates voice data from the numeric character string by assigning to the special-pattern string detected by the special pattern detection means a reading that follows the rule predetermined for that special pattern.

[0028] In the document reading device of claim 10, the special pattern detection means has table data listing a plurality of special-pattern numeric character strings, and searches for the special-pattern numeric character string by matching the numeric character strings that appear in the Japanese-language analysis result against the table data.

[0029] In the document reading device of claim 11, the voice data generation means has table data listing the rules determined for each of the plurality of special-pattern numeric character strings, and generates voice data from the numeric character string by assigning to the special-pattern string detected by the detection means a reading that follows the rule in the table data corresponding to that special pattern.

[0030] In the document reading device of claim 12, the special word detection means detects a predetermined special word in the Japanese-language analysis result. The extraction means extracts a numeric character string located close before or after the special word detected by the special word detection means. The first voice data generation means generates voice data from the numeric character string by assigning to the string extracted by the extraction means a reading that follows the rule predetermined for the special word detected by the special word detection means. The special pattern detection means detects a numeric character string of a predetermined special pattern in the analysis result. The second voice data generation means generates voice data from the numeric character string by assigning to the string detected by the special pattern detection means a reading that follows the rule predetermined for the special pattern that the string exhibits.

[0031] In the document reading device of claim 13, the setting means sets which is to be given priority: the voice data generation process triggered when the special word detection means detects a special word, or the voice data generation process triggered when the special pattern detection means detects a special-pattern numeric character string. The determination means determines whether the numeric character string extracted by the extraction means matches the numeric character string detected by the special pattern detection means. When the determination means determines that the two numeric character strings match, the control means executes the voice data generation process indicated by the priority information set in the setting means.

[0032]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing an embodiment of the voice data generation method of the invention and of the document reading device of the invention that uses this method. In FIG. 1, 1 is a document data file that stores document data in a computer-processable form; 2 is an input device for entering the various settings used during reading; 3 is a control unit that performs overall control when document data are read aloud; 4 is a Japanese analysis unit that analyzes the document data to be read morphologically, syntactically, and semantically with reference to word dictionary 6; 5 is a setting buffer that holds the various settings used during reading; 6 is a word dictionary that collects, as a list, the headwords, parts of speech, readings, accents, meanings, and other information used to analyze document data; 7 is an analysis result buffer that stores the analysis results produced by Japanese analysis unit 4; 8 is a voice data generation unit that generates voice data corresponding to the analysis results of Japanese analysis unit 4; 9 is a voice data generation rule file that stores the voice data generation rules consulted when voice data generation unit 8 generates voice data; 10 is a voice data file that stores the voice data generated by voice data generation unit 8; 11 is a voice synthesizer that synthesizes a voice signal by rule from the voice data read from voice data file 10; 12 is a special character table that holds, as a list, special characters such as 「郵便番号」 (postal code) and 「比率」 (ratio), for which the numeric character string preceding or following them is read in a special way; 13 is a voice output unit, such as a speaker, that outputs the voice signal; 14 is a display device, such as a CRT or LCD, that shows display data on a screen; 15 is a display data creation unit that creates, from the voice data, the display data to be shown on display device 14; 16 is a display data file that stores the display data; 17 is a special character detection unit that detects, in the Japanese-language analysis result, the special characters held in special character table 12; 18 is a special pattern table that holds, as a list, patterns of special numeric character strings, such as a numeric character string representing a telephone number; 19 is a special pattern detection unit that detects, in the Japanese-language analysis result, the special patterns held in special pattern table 18; 20 is a special numeric character processing unit that, when voice data are processed for a special numeric character string, reads the rule information from special character processing rule table 21 or special pattern processing rule table 22 and supplies it to voice data generation unit 8; 21 is a special character processing rule table that holds, as a list, the voice data generation rules for the special characters; and 22 is a special pattern processing rule table that holds, as a list, the voice data generation rules for the special numeric character string patterns mentioned above.

[0033] Next, the operation of this embodiment is described with reference to the flowchart shown in FIG. 2. The document data stored in document data file 1 consist of a plurality of documents stored in a computer-processable form. These document data were created with a document creation device such as a Japanese word processor, or read into the computer by OCR or similar means.

【００３４】The document reading process starts when the control unit 3 receives an instruction from the input device 2, such as a keyboard. At this time, through input from the input device 2, the operator gives the control unit 3 the designation of the document to be read aloud and the conditions for reading it. The operator can make these settings by following the guide that the control unit 3 displays on the display device 14, such as a CRT or LCD.

【００３５】In response to these operator actions, in step 201 of FIG. 2 the control unit 3 reads the document designated through the input device 2 from the document data file 1, thereby selecting the document to be read aloud, and passes the selected document to the Japanese analysis unit 4. Next, in step 202, the control unit 3 writes the reading conditions entered from the input device 2 into the setting buffer 5. A part of the document data designated by the operator through the input device 2 is, for example, as shown in FIG. 3. The operator can also set the reading conditions from the input device 2 by following the guide that the control unit 3 displays on the display device 14. The conditions that can be set include the basic reading speed, voice quality, pitch, and intensity; the set time for ending the reading; whether emphasized characters are read specially; what changes when emphasized characters are read specially; whether to read aloud; the length of pauses; and whether the readings of special characters are replaced. These condition data are stored in the setting buffer 5 as shown in FIG. 4.

【００３６】On receiving an instruction from the control unit 3, the Japanese analysis unit 4, in step 203, analyzes the passed document morphologically, syntactically, and semantically while referring to the word dictionary 6, which contains morphological information, reading information, accent information, inter-word co-occurrence information, and the like for each word. The analysis segments the document into words. The Japanese analysis unit 4 writes the word-by-word analysis results into the analysis result buffer 7 and then sends the control unit 3 a signal indicating that the analysis of the document is complete.

【００３７】The word dictionary 6 used by the Japanese analysis unit 4 has, for example, the data structure shown in FIG. 5, and the analysis result buffer 7, which stores the analysis results of the Japanese analysis unit 4, has, for example, the data structure shown in FIG. 6. When the Japanese analysis unit 4 analyzes the document to derive readings and a word has more than one possible reading, the appropriate reading is determined from the co-occurrence information described in the word dictionary.

【００３８】On receiving the signal indicating that the analysis of the document is complete, the control unit 3 activates the voice data generation unit 8. The voice data generation unit 8 refers to the data in the voice data generation rule file 9, generates voice data corresponding to the Japanese analysis result in the analysis result buffer 7, and stores it in the voice data file. In doing so, the voice data generation unit 8 refers to the special character table 12 and determines in step 204 whether the Japanese analysis result in the analysis result buffer 7 contains a special character (word) registered in the special character table 12, an example of which is shown in FIG. 10. If one is present, the unit reads the rule corresponding to the found special character from the special character table 12 and proceeds to step 207; if not, it proceeds to step 205.

【００３９】On proceeding to step 207, the voice data generation unit 8 extracts from the Japanese analysis result the numeric character strings before and after the detected special character, gives the extracted numeric character string and the rule read from the special character table 12 to the special numeric character processing unit 20, and activates it. In step 208, the special numeric character processing unit 20 then reads, from the data in the special character processing rule table 21 (an example is shown in FIG. 11), the processing rule corresponding to the given rule, and gives it to the voice data generation unit 8. In step 209, the voice data generation unit 8 generates voice data by assigning a reading and accent to the given numeric character string in accordance with that rule.

【００４０】On proceeding to step 205, on the other hand, the voice data generation unit 8 refers to the special pattern table 18, an example of which is shown in FIG. 13, and determines whether the Japanese analysis result in the analysis result buffer 7 contains a special pattern registered in the special pattern table 18. If one is present, the unit reads the rule corresponding to the found special pattern from the special pattern table 18 and proceeds to step 206; if not, it proceeds to step 209. On proceeding to step 206, the voice data generation unit 8 gives the rule read from the special pattern table 18 to the special numeric character processing unit 20 and thereby activates it. In step 208, the special numeric character processing unit 20 then reads, from the data in the special pattern processing rule table 22, which has the structure shown in FIG. 14, the processing rule corresponding to the given rule, and gives it to the voice data generation unit 8. In step 209, the voice data generation unit 8 generates voice data by assigning a reading and accent to the detected special numeric character string in accordance with that rule.

【００４１】Next, the processing in steps 204 to 209 is described concretely. Take the sentence 「彼の郵便番号は１２３−４５だ」 ("His postal code is 123-45") shown in the upper part of FIG. 9. If 「郵便番号」 ("postal code") is not registered in the special character table 12 as a special character string, the numeric portion 「１２３−４５」 is given the reading 「ひゃくに＾じゅう／さん／まいなす／よ＾んじゅう／ご」 (hyaku-nijū-san mainasu yonjū-go, "one hundred twenty-three minus forty-five"). If 「郵便番号」 is registered in the special character table 12 as shown in FIG. 10, the special character detection unit 17 detects it as a special character, the special numeric character processing unit 20 is activated, and that unit refers to the special character processing rule table 21 to generate the reading. In this example, the special character processing rule table 21 contains a rule that the adjacent digits are read one digit at a time and that 「−」 is read as 「の」 (no). Under this rule, the reading of 123-45 becomes 「いちに＾い／さんの／よんご＾−だ」 (ichi-nii / san no / yon-gō da), as shown in the lower part of FIG. 9.
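The postal-code rule described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation disclosed in the patent: the names `DIGIT_KANA` and `read_digit_by_digit` are hypothetical, half-width input is assumed, and the accent marks (「＾」), accent-phrase boundaries (「／」), and vowel lengthening shown in FIG. 9 are deliberately left out.

```python
# Hypothetical digit-to-kana table for digit-by-digit reading.
DIGIT_KANA = {
    "0": "ぜろ", "1": "いち", "2": "に", "3": "さん", "4": "よん",
    "5": "ご", "6": "ろく", "7": "なな", "8": "はち", "9": "きゅう",
}


def read_digit_by_digit(text, separators):
    """Read each digit on its own and replace separators with a fixed reading.

    `separators` maps a separator character to its reading, standing in for
    one entry of the special character processing rule table 21.
    """
    parts = []
    for ch in text:
        if ch in DIGIT_KANA:
            parts.append(DIGIT_KANA[ch])
        elif ch in separators:
            parts.append(separators[ch])
        else:
            raise ValueError(f"no reading defined for {ch!r}")
    return "".join(parts)


# Postal-code rule: "-" is read as "の".
print(read_digit_by_digit("123-45", {"-": "の"}))  # いちにさんのよんご
```

Keeping the separator readings in a dictionary mirrors the table-driven design of the description: a different special word would only swap the mapping, not the reading routine.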

【００４２】Likewise, suppose the character string 「０３（１２３）４５６７」 shown in the upper part of FIG. 12 appears in the document. If no matching pattern is registered in the special pattern table 18 referred to by the special pattern detection unit 19, the string is given the reading 「ぜろさ＾ん／か＾っこ／ひゃくに＾じゅうさん／か＾っこ／よんせ＾ん／ごひゃくろくじゅうな＾な」 (zero-san / kakko / hyaku-nijū-san / kakko / yonsen / gohyaku-rokujū-nana, "zero three, parenthesis, one hundred twenty-three, parenthesis, four thousand five hundred sixty-seven"). If, however, the special pattern table 18, which has the structure shown in FIG. 13, contains the entry 「数字（数字）数字」 ("digits (digits) digits"), the special pattern detection unit 19 compares the input against the registered patterns. When the input character string matches an entry in the special pattern table 18, the special numeric character processing unit 20 is activated and refers to the special pattern processing rule table 22 shown in FIG. 14 to generate the reading. In this example, the rule table contains a rule that the adjacent digits are read one digit at a time and that 「（」 and 「）」 are each read as 「の」 (no). Under this rule, the reading of the input string becomes 「ぜろさ＾んの／いちに＾いさんの／よんご−ろくな＾な」 (zero-san no / ichi-nii-san no / yon-gō-roku-nana). The reading given to 「０３（１２３）４５６７」 is therefore as shown in the lower part of FIG. 12.
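The pattern comparison performed by the special pattern detection unit 19 can be illustrated as follows. This is a minimal sketch under stated assumptions: the patent does not say how patterns are matched, so collapsing each run of digits into the token 「数字」 and looking the result up in a dictionary is one possible approach, and `SPECIAL_PATTERNS`, `pattern_signature`, and `detect_special_pattern` are hypothetical names. Half-width input is assumed.

```python
import re

# Hypothetical stand-in for the special pattern table 18:
# pattern signature -> name of the rule to apply.
SPECIAL_PATTERNS = {
    "数字(数字)数字": "telephone_number",
    "数字:数字": "time",
    "数字.数字": "decimal",
}


def pattern_signature(text):
    """Collapse every run of digits into the token 数字."""
    return re.sub(r"[0-9]+", "数字", text)


def detect_special_pattern(text):
    """Return the rule name registered for the string's pattern, or None."""
    return SPECIAL_PATTERNS.get(pattern_signature(text))


print(pattern_signature("03(123)4567"))       # 数字(数字)数字
print(detect_special_pattern("03(123)4567"))  # telephone_number
print(detect_special_pattern("12345"))        # None
```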

【００４３】Similarly, for the character string 「１０：００」 in the document to be read, if the pattern 「数字：数字」 ("digits : digits") is registered in the special pattern table 18, the special pattern detection unit 19 detects that pattern by comparing the string against the registered patterns, and the special numeric character processing unit 20 is activated. The special numeric character processing unit 20 then refers to the data in the special pattern processing rule table 22 to generate the reading. In this example, the table contains a rule that 「：」 is read as 「じ」 (ji, "o'clock") and that 「ふん」 (fun, "minutes") is added after the trailing number, which yields the reading 「じゅ＾うじ／ぜろぜろ＾ふん」 (jū-ji / zero-zero-fun). Depending on the rules written in the special pattern processing rule table 22, the reading 「じゅ＾うじ／ぜろ＾ふん」 (jū-ji / zero-fun) or 「じゅ＾うじ」 (jū-ji) can be generated instead. In step 209, the voice data generation unit 8 generates voice data for the resulting reading.

【００４４】Suppose the document to be read contains a numeric character string with a decimal point, for example 「１２．３４５６」. The special pattern detection unit 19 compares the string against the patterns registered in the special pattern table 18 shown in FIG. 13, detects the pattern 「数字．数字」 ("digits . digits"), and activates the special numeric character processing unit 20. The special numeric character processing unit 20 then refers to the data in the special pattern processing rule table 22 shown in FIG. 14 to generate the reading. In this example, the table contains the following rule: 「．」 is read as 「てん」 (ten, "point"); the digits above the decimal point are read as a number with place values; the digits below the decimal point are read one digit at a time; and when there are two or more digits below the decimal point, every two digits are grouped into one accent phrase. The part after the decimal point in this example is therefore read 「さんよ＾ん／ごーろ＾く」 (san-yon / gō-roku). Overall, the rule written in the special pattern processing rule table 22 produces the reading 「じゅ＾うにてん／さんよ＾ん／ごーろ＾く」 (jūni-ten / san-yon / gō-roku). In step 209, the voice data generation unit 8 generates voice data for this reading.

【００４５】With such numeric patterns, however, the number of digits after the decimal point is not always a multiple of two, so grouping them two at a time can leave a remainder. In the numeric character string 「１２．３４５６７」, for example, grouping the fractional digits in twos leaves the 7 over. In that case, the voice data is generated with the last three digits grouped into a single accent phrase, giving the reading 「さんよ＾ん／ごーろ＾くな＾な」 (san-yon / gō-roku-nana). This is written as a rule in the special pattern processing rule table 22, as shown in FIG. 14. As an alternative for the case where the number of fractional digits is not a multiple of two, the last digit may be treated as an independent accent phrase, giving the reading 「さんよ＾ん／ごーろ＾く／な＾な」 (san-yon / gō-roku / nana). The example shown in FIG. 14 adopts the former approach.
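The grouping of fractional digits into accent phrases, including both ways of handling a leftover digit, can be sketched as follows. This is an illustrative sketch, not the patent's implementation: `group_fraction_digits`, `read_fraction`, and the `merge_remainder` flag are hypothetical names, half-width digits are assumed, and accent marks and vowel lengthening are omitted from the generated reading.

```python
# Hypothetical digit-to-kana table for digit-by-digit reading.
DIGIT_KANA = {
    "0": "ぜろ", "1": "いち", "2": "に", "3": "さん", "4": "よん",
    "5": "ご", "6": "ろく", "7": "なな", "8": "はち", "9": "きゅう",
}


def group_fraction_digits(digits, merge_remainder=True):
    """Split the digits after the decimal point into accent-phrase groups.

    Digits are grouped two at a time. With merge_remainder=True a leftover
    digit joins the previous pair (last three digits form one phrase);
    with False it stays as an independent phrase.
    """
    groups = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    if merge_remainder and len(groups) >= 2 and len(groups[-1]) == 1:
        leftover = groups.pop()
        groups[-1] += leftover
    return groups


def read_fraction(digits, merge_remainder=True):
    """Digit-by-digit reading, with ／ marking accent-phrase boundaries."""
    groups = group_fraction_digits(digits, merge_remainder)
    return "／".join("".join(DIGIT_KANA[d] for d in g) for g in groups)


print(group_fraction_digits("3456"))          # ['34', '56']
print(group_fraction_digits("34567"))         # ['34', '567']
print(group_fraction_digits("34567", False))  # ['34', '56', '7']
print(read_fraction("34567"))                 # さんよん／ごろくなな
```

The default corresponds to the approach adopted in FIG. 14; passing `merge_remainder=False` gives the alternative in which the last digit forms its own accent phrase.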

【００４６】When a numeric character string sent to the special numeric character processing unit 20 satisfies both the special character detection unit 17 and the special pattern detection unit 19, the rule of the special character detection unit 17 takes priority. Consider the sentence 「その比率は１０：１３である。」 ("The ratio is 10:13."). If the special pattern table 18 contains the entry 「数字：数字」 ("digits : digits"), the special pattern detection unit 19 compares the numeric character string 「１０：１３」 against the registered patterns. When the input string matches the special pattern data, the special numeric character processing unit 20 is activated and refers to the data in the special pattern processing rule table 22 to generate the reading. In this example, the table contains a rule that 「：」 is read as 「じ」 (ji, "o'clock") and that 「ふん」 (fun, "minutes") is added after the trailing number, which yields the reading 「じゅ＾うじ／じゅうさ＾んぷん」 (jū-ji / jūsan-pun, "ten o'clock, thirteen minutes"). On the basis of the pattern alone, the voice data generation unit 8 would generate voice data for this reading in step 209.

【００４７】If 「比率」 ("ratio") is registered in the special character table 12, however, the special character detection unit 17 detects 「比率」 as a special character from the special character table 12, the special numeric character processing unit 20 is activated, and that unit refers to the special character processing rule table 21 to generate the reading. In this example, the special character processing rule table 21 contains a rule that the adjacent digits are read one digit at a time and that 「：」 is read as 「たい」 (tai, "to"). Under this rule, the reading of 「１０：１３」 becomes 「じゅった＾い／じゅ＾うさん」 (jut-tai / jūsan), and the reading of the whole sentence becomes 「そのひりつは／じゅうた＾い／じゅ＾うさんで／あ＾る」 (sono hiritsu wa / jū-tai / jūsan de / aru). Because the reading obtained through detection by the special character detection unit 17 takes priority, as described above, the voice data generation unit 8 adopts the reading 「そのひりつは／じゅうた＾い／じゅ＾うさんで／あ＾る」 and generates voice data for it in step 209.

【００４８】The priority that applies when detection of a special character and detection of a special pattern occur for the same numeric character string can be set in advance in the control unit 3 from the input device 2. The control unit 3 sets this priority in the voice data generation unit 8, and the processing described above is carried out accordingly. If the priority is set the other way around in the control unit 3, the voice data generation triggered by detection by the special pattern detection unit 19 takes precedence and is executed.
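The configurable priority between the two detections can be sketched as a small selection function. This is a hedged illustration only: `choose_rule` and the `prefer` setting are hypothetical names, and the rule identifiers are placeholders for whatever the special character table 12 and the special pattern table 18 actually return.

```python
def choose_rule(word_rule, pattern_rule, prefer="word"):
    """Pick the rule to apply when one or both detections succeed.

    word_rule    -- rule found via a special word (or None if not detected)
    pattern_rule -- rule found via a special pattern (or None if not detected)
    prefer       -- "word" or "pattern"; stands in for the priority that the
                    operator sets in advance through the control unit
    """
    if prefer not in ("word", "pattern"):
        raise ValueError("prefer must be 'word' or 'pattern'")
    if prefer == "word":
        first, second = word_rule, pattern_rule
    else:
        first, second = pattern_rule, word_rule
    # Fall back to the other detection when the preferred one found nothing.
    return first if first is not None else second


# "その比率は10:13である。": both the word 比率 and the pattern 数字:数字 match.
print(choose_rule("ratio", "time"))                    # ratio
print(choose_rule("ratio", "time", prefer="pattern"))  # time
print(choose_rule(None, "time"))                       # time
```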

【００４９】For the portions of the Japanese analysis result data in the analysis result buffer 7 that are neither special characters nor special patterns, the voice data generation unit 8 generates voice data in step 208 by referring to the voice data generation rule file 9. For the special characters and special patterns, it generates voice data for the readings assigned to them. The generated voice data is stored in the voice data file 10. When generation is finished, the voice data generation unit 8 sends the control unit 3 a signal indicating that the voice data has been generated.

【００５０】A part of the voice data generation rules is shown in FIG. 7, and the voice data that is output is shown in FIG. 8. FIG. 7 is an example of the rule that, for a godan (five-grade) verb whose accent type is not O-type, the accent type is changed to O-type when the verb is in the irrealis (mizen) form. FIG. 8 shows the format of the voice data file 10. In the reading character string data, katakana characters represent the voice data, 「＾」 marks the accent position, and 「．」 represents a pause of the length set in the setting buffer 5.
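The three kinds of symbols in the reading character string (katakana, 「＾」, and 「．」) can be separated with a simple scanner. This is a minimal sketch under stated assumptions: the full file format of FIG. 8 is not reproduced in the text, so `parse_voice_data`, the event tuples, and the `pause_length` parameter (standing in for the pause length held in the setting buffer 5, in unspecified units) are hypothetical.

```python
ACCENT_MARKS = {"^", "＾"}  # half-width and full-width accent marks
PAUSE_MARKS = {".", "．"}   # half-width and full-width pause marks


def parse_voice_data(text, pause_length):
    """Turn a reading character string into a list of (kind, value) events."""
    events = []
    for ch in text:
        if ch in ACCENT_MARKS:
            events.append(("accent", None))
        elif ch in PAUSE_MARKS:
            events.append(("pause", pause_length))
        else:
            events.append(("kana", ch))
    return events


print(parse_voice_data("カ＾レ．", 0.5))
# [('kana', 'カ'), ('accent', None), ('kana', 'レ'), ('pause', 0.5)]
```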

【００５１】On receiving the signal indicating that the voice data has been generated, the control unit 3 passes the voice data in the voice data file 10 to the voice synthesizer 11. In step 210, the voice synthesizer 11 converts the voice data into an electrical signal using the speed, voice quality, pitch, and intensity set in the setting buffer 5, and the voice is output from the voice output unit 13, such as a speaker. The voice synthesizer 11 is a device that, given data consisting of a phoneme string and special control codes, converts it into an electrical voice signal; this data has the same format as the voice data in the voice data file 10. In other words, the voice synthesizer 11 is a device that performs rule-based speech synthesis when it receives a character string in the format of the voice data file 10: it applies rule-based synthesis to the character strings that follow, using the specified speed, voice quality, pitch, and intensity.

【００５２】The document is read aloud through the series of steps 203 to 210 shown in FIG. 2. In step 211, the control unit 3 determines whether the document has been read to the end. If it has not, the process returns to step 203 via step 212; if the reading is finished, the process ends.

【００５３】According to this embodiment, when voice data is generated from the Japanese analysis result, numeric character strings that call for a special reading rather than a place-value reading, such as telephone numbers and postal codes, are distinguished from numeric character strings that are read with ordinary place values. A numeric character string of the former kind can be given a reading suited to its context or scene on the basis of its pattern, or of a special character such as 「電話番号」 ("telephone number") or 「郵便番号」 ("postal code") appearing before or after it, so such strings can always be read appropriately. As a result, when digits satisfying the conditions are adjacent to special characters such as 「（」, 「）」, or 「−」, the device produces a natural reading suited to telephone numbers and postal codes, such as 「ぜろさんの／いちにいさんの／よんご−ろくなな」 (zero-san no / ichi-nii-san no / yon-gō-roku-nana). This avoids readings that sound unnatural to the listener or that could be misunderstood, and improves the performance of the document reading device.

【００５４】[0054]

[Effects of the Invention] As described above, according to the invention of claim 1 or 6, voice data can be generated from the numeric character string with a reading that fits the scene or context delimited by the special word, so the numeric character string is read aloud naturally.

【００５５】According to the invention of claim 2 or 9, voice data can be generated with a reading that fits the scene or context delimited by the numeric character string of the special pattern, so the numeric character string is read aloud naturally.

【００５６】According to the invention of claim 3 or 12, voice data can be generated from the numeric character string either with a reading that fits the scene or context delimited by the special word, or with a reading that fits the scene or context delimited by the pattern of the numeric character string itself, so the numeric character string is read aloud naturally.

【００５７】According to the invention of claim 4 or 13, voice data can be generated from the numeric character string with a reading that fits the scene or context delimited by either the special word or the pattern of the numeric character string, so the numeric character string is read aloud naturally.

【００５８】According to the invention of claim 5, voice data can be generated from the numeric character string with a reading that fits the scene or context delimited by the special word, so the numeric character string is read aloud naturally.

【００５９】According to the invention of claim 7, the special word can be searched for reliably.

【００６０】According to the invention of claim 8, a reading that follows the rule corresponding to the special word can be obtained reliably.

【００６１】According to the invention of claim 10, the patterned numeric character string can be searched for reliably.

【００６２】According to the invention of claim 11, a reading that follows the rule corresponding to the numeric character string of the special pattern can be obtained reliably.

[Brief description of drawings]

【図１】FIG. 1 is a block diagram showing an embodiment of the document reading device of the present invention.

【図２】FIG. 2 is a flowchart showing the document reading process of the device shown in FIG. 1.

【図３】FIG. 3 is a diagram showing a part of the document data in the document data file shown in FIG. 1.

【図４】FIG. 4 is a diagram showing an example of the contents of the setting buffer shown in FIG. 1.

【図５】FIG. 5 is a diagram showing an example of the structure of the word dictionary shown in FIG. 1.

【図６】FIG. 6 is a diagram showing an example of the contents of the analysis result buffer shown in FIG. 1.

【図７】FIG. 7 is a diagram showing an example of the data in the voice data generation rule file shown in FIG. 1.

【図８】FIG. 8 is a diagram showing an example of the voice data in the voice data file shown in FIG. 1.

【図９】FIG. 9 is a diagram showing an example of a special character in a document and an example of the voice data generated by applying a special character processing rule.

【図１０】FIG. 10 is a diagram showing an example of the special characters in the special character table shown in FIG. 1.

【図１１】FIG. 11 is a diagram showing an example of the special character processing rules in the special character processing rule table shown in FIG. 1.

【図１２】FIG. 12 is a diagram showing an example of a special pattern in a document and an example of the voice data generated by applying a special pattern processing rule.

【図１３】FIG. 13 is a diagram showing an example of the special patterns in the special pattern table shown in FIG. 1.

【図１４】FIG. 14 is a diagram showing an example of the special pattern processing rules in the special pattern processing rule table shown in FIG. 1.

[Explanation of symbols]

1: document data file
2: input device
3: control unit
4: Japanese analysis unit
5: setting buffer
6: word dictionary
7: analysis result buffer
8: voice data generation unit
9: voice data generation rule file
10: voice data file
11: voice synthesizer
12: special character table
13: voice output unit
14: display device
15: display data generation unit
16: display data file
17: special character detection unit
18: special pattern table
19: special pattern detection unit
20: special numeric character processing unit
21: special character processing rule table
22: special pattern processing rule table

Continuation of the front page: (51) Int. Cl.⁶ // G06F 17/28

Claims

1. A voice data generation method for generating voice data, in accordance with voice data generation rules, from an analysis result obtained by applying Japanese analysis to document data in a form that can be handled by a computer, characterized in that, when a predetermined special word is detected, voice data is generated from the numeric character string existing before and after the special word by assigning to that numeric character string a reading that follows the rule corresponding to the detected special word.

2. A voice data generation method for generating voice data, in accordance with voice data generation rules, from an analysis result obtained by applying Japanese analysis to document data in a form that can be handled by a computer, characterized in that, when a numeric character string of a predetermined special pattern is detected, voice data is generated from the numeric character string by assigning to it a reading that follows the rule corresponding to the pattern.

3. A voice data generation method for generating voice data, in accordance with voice data generation rules, from an analysis result obtained by applying Japanese analysis to document data in a form that can be handled by a computer, characterized in that, when a predetermined special word is detected, voice data is generated from the numeric character string existing before and after the special word by assigning to that numeric character string a reading that follows the rule corresponding to the detected special word, or, when a numeric character string of a predetermined special pattern is detected in the Japanese analysis result, voice data is generated from the numeric character string by assigning to it a reading that follows the rule corresponding to the pattern.

4. The voice data generation method according to claim 3, characterized in that, when the detection of the special word and the detection of the numeric character string of the special pattern occur simultaneously for the same numeric character string, the voice data generation processing triggered by whichever detection has the preset higher priority is performed, and voice data is generated from the corresponding numeric character string.

5. The voice data generation method according to claim 4, wherein the voice data generation processing triggered by the detection of the special word is given priority.
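The priority handling of claims 4 and 5 can be illustrated with a small selection function. This is a minimal sketch, assuming each detection has already produced a candidate reading for the same numeric character string (or `None` if it did not fire); the function name, the `priority` parameter, and its string values are hypothetical.

```python
def choose_reading(word_reading, pattern_reading, priority="special_word"):
    """Pick one reading when both detections fire for the same numeric string.

    word_reading    -- reading from the special-word rule, or None
    pattern_reading -- reading from the special-pattern rule, or None
    priority        -- "special_word" (the default, as in claim 5) or
                       "special_pattern"
    """
    if word_reading is not None and pattern_reading is not None:
        # Both detections occurred: the preset priority decides.
        return word_reading if priority == "special_word" else pattern_reading
    # At most one detection occurred: use whichever reading exists.
    return word_reading if word_reading is not None else pattern_reading
```

With the default setting, `choose_reading("A", "B")` returns `"A"`; passing `priority="special_pattern"` returns `"B"` instead, and a conflict never arises when only one detection produced a reading.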

6. A document reading device that generates voice data, according to a voice data generation rule, from an analysis result obtained by performing Japanese analysis on document data to be read aloud, converts the voice data into an electrical voice signal with a voice synthesizer, and outputs the resulting voice signal to the outside as voice through a voice output device, the document reading device comprising: special word detecting means for detecting a predetermined special word from the Japanese analysis result; extraction means for extracting, when the special word detecting means detects the special word, a numeric character string existing before and after the special word; and voice data generating means for generating voice data from the numeric character string extracted by the extraction means by applying to it a reading according to a rule predetermined in correspondence with the special word detected by the special word detecting means.

7. The document reading device according to claim 6, characterized in that the special word detecting means has table data listing a plurality of special words, and searches for the special word by collating words appearing in the Japanese analysis result with the table data.

8. The document reading device according to claim 6 or 7, characterized in that the voice data generating means has table data listing the rules determined in correspondence with each of the plurality of special words, and generates voice data from the numeric character string extracted by the extraction means by applying to it a reading according to the rule in the table data corresponding to the special word detected by the special word detecting means.

9. A document reading device that generates voice data, according to a voice data generation rule, from an analysis result obtained by performing Japanese analysis on document data to be read aloud, converts the voice data into an electrical voice signal with a voice synthesizer, and outputs the resulting voice signal to the outside as voice through a voice output device, the document reading device comprising: special pattern detecting means for detecting a numeric character string of a predetermined special pattern from the Japanese analysis result; and voice data generating means for generating voice data from the numeric character string of the special pattern detected by the special pattern detecting means by applying to it a reading according to a rule predetermined in correspondence with the special pattern.

10. The document reading device according to claim 9, characterized in that the special pattern detecting means has table data listing a plurality of special pattern numeric character strings, and searches for the numeric character string of the special pattern by collating numeric character strings appearing in the Japanese analysis result with the table data.

11. The document reading device according to claim 9 or 10, characterized in that the voice data generating means has table data listing the rules determined in correspondence with each of the plurality of special pattern numeric character strings, and generates voice data from the numeric character string of the special pattern detected by the special pattern detecting means by applying to it a reading according to the rule in the table data corresponding to the special pattern.

12. A document reading device that generates voice data, according to a voice data generation rule, from an analysis result obtained by performing Japanese analysis on document data to be read aloud, converts the voice data into an electrical voice signal with a voice synthesizer, and outputs the resulting voice signal to the outside as voice through a voice output device, the document reading device comprising: special word detecting means for detecting a predetermined special word from the Japanese analysis result; extracting means for extracting, when the special word detecting means detects the special word, a numeric character string existing before and after the special word; first voice data generating means for generating voice data from the numeric character string extracted by the extracting means by applying to it a reading according to a rule predetermined in correspondence with the special word detected by the special word detecting means; special pattern detecting means for detecting a numeric character string of a predetermined special pattern from the Japanese analysis result; and second voice data generating means for generating voice data from the numeric character string detected by the special pattern detecting means by applying to it a reading according to a rule predetermined in correspondence with the special pattern.

13. The document reading device according to claim 12, further comprising: setting means for setting which is to be given priority, the voice data generation processing triggered by the detection of the special word by the special word detecting means or the voice data generation processing triggered by the detection of the numeric character string of the special pattern by the special pattern detecting means; judging means for judging whether the numeric character string extracted by the extracting means and the numeric character string detected by the special pattern detecting means coincide; and control means for executing, when the judging means judges that the two numeric character strings coincide, the voice data generation processing according to the priority information set in the setting means.