JPH08129398A

JPH08129398A - Text analysis device

Info

Publication number: JPH08129398A
Application number: JP6268853A
Authority: JP
Inventors: Tadahiro Hoshino (星野 恭祐); Katsumi Takahashi (髙橋 勝美); Mitsuji Matsushita (松下 満次)
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1994-11-01
Filing date: 1994-11-01
Publication date: 1996-05-21

Abstract

PURPOSE: To provide a text analysis device that satisfies two conflicting requirements: expanding the word dictionary and preventing a drop in the speed of text analysis processing. CONSTITUTION: A fixed dictionary storage unit 23 and a user dictionary storage unit 24 have a plurality of word storage areas 23A, 23B, 23C, ... and 24A, 24B, 24C, ..., respectively, corresponding to a plurality of categories A, B, .... The control unit 12 of a text data generator 10 transfers, together with the text data, a command designating the categories to be used. The control unit 22 of a text-to-speech conversion device 20 extracts the command from the received data. A category selection unit 25 selects the categories designated by the extracted command. The control unit 22 analyzes the text data using the words belonging to the selected categories.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a text analysis device that analyzes text data using a word dictionary, for example in a text-to-speech conversion system that converts text data into speech waveform data.

[0002]

[Prior Art] In recent years, text-to-speech conversion systems have attracted attention as speech synthesis systems because, among other reasons, they place no restriction on the output vocabulary. A text-to-speech conversion system converts text data, that is, a character code string, into speech waveform data so that sentences can be read aloud.

[0003] Text-to-speech conversion processing, which converts text data into speech waveform data, is broadly divided into text analysis processing, synthesis parameter generation processing, and speech synthesis processing.

[0004] Here, text analysis processing analyzes text data using a word dictionary and generates a phoneme-prosody symbol string that describes the reading, accent, intonation, and so on of the text data.

[0005] Synthesis parameter generation processing generates, from the phoneme-prosody symbol string obtained by text analysis processing, the parameters needed for speech synthesis, such as the speech units, the duration of each phoneme, and the temporal patterns of pitch (voice height) and amplitude (voice loudness).

[0006] Speech synthesis processing generates speech waveform data from the synthesis parameters produced by synthesis parameter generation processing.

[0007] The word dictionary usually consists of a fixed dictionary and a user dictionary. The fixed dictionary is a dictionary in which standard words are registered in advance. The user dictionary, by contrast, is a dictionary in which the user can freely register words.

[0008] With a user dictionary provided in this way, even when the text data contains a word that is not registered in the fixed dictionary, the text data can be analyzed accurately by registering that word in the user dictionary.

[0009]

[Problems to Be Solved by the Invention] However, the conventional text-to-speech conversion system described above has the following problem.

[0010] The word dictionary is generally compiled to be broad and shallow so that it can handle a wide variety of text data. Such a dictionary cannot cover the technical terms of a specific field, so text data from that field sometimes cannot be analyzed accurately.

[0011] This problem could be solved by compiling the word dictionary to be both broad and deep, that is, by expanding the word dictionary. Doing so, however, makes the word dictionary so large that matching the text data against it takes longer. As a result, a time lag arises between the input of text data and the output of speech.

[0012] For these reasons, a text-to-speech conversion system calls for a text analysis device that can accurately analyze text data from a specific field without introducing a time lag. In other words, what is desired is a text analysis device that satisfies two conflicting requirements: expanding the word dictionary and preventing a drop in the speed of text analysis processing.

[0013]

[Means for Solving the Problems] To solve the above problem, the present invention provides: means that has a plurality of word storage areas, each corresponding to one of a plurality of categories for classifying words, with each word storage area storing the words that belong to the corresponding category; means for selecting, from among the plurality of categories, the categories to be used for analyzing text data; and means for analyzing the text data using the words stored in the word storage areas corresponding to the selected categories.

[0014]

[Operation] In the above configuration, the categories to be used for analysis are selected before text analysis processing is executed. The text data is then analyzed using the words stored in the word storage areas corresponding to the selected categories.

[0015] With this configuration, text data can be analyzed using only the part of the word dictionary in which the words needed for the analysis are registered. Therefore, even if the dictionary is compiled to be broad and deep, the speed of text analysis processing can be kept from dropping. As a result, the two conflicting requirements of expanding the word dictionary and preventing a drop in the speed of text analysis processing can both be satisfied.

[0016]

[Embodiments] Embodiments of the present invention are described in detail below with reference to the drawings. The following description takes as a representative case the application of the invention to the text analysis device of a text-to-speech conversion system.

[0017] (1) First Embodiment

First, a first embodiment of the present invention is described in detail.

[0018] (1-1) Outline of the First Embodiment

In this embodiment, words are classified by category when they are registered in the word dictionary. When text data is analyzed, the categories to be used are selected from among the plurality of categories, and the analysis uses the words belonging to the selected categories.

[0019] In addition, when the categories to be used are selected in this embodiment, information designating the categories is output together with the text data, the category designation information is extracted from this output, and the categories to be used are selected on the basis of the extracted information.

[0020] Furthermore, when text data is analyzed using the words belonging to the selected categories in this embodiment, those words are referenced according to a predetermined priority order.

[0021] (1-2) Configuration of the First Embodiment

FIG. 1 is a block diagram showing the configuration of the first embodiment of the present invention. FIG. 1 shows the overall configuration of a text-to-speech conversion system equipped with the text analysis device of the invention.

[0022] In the figure, reference numeral 10 denotes a text data generator that generates text data. The text data generator 10 is, for example, a personal computer or a word processor. Reference numeral 20 denotes a text-to-speech conversion device that converts the text data output from the text data generator 10 into speech waveform data.

[0023] In the text data generator 10, reference numeral 11 denotes a file storage unit in which a plurality of text files are stored. Reference numeral 12 denotes a control unit that performs interface control with the text-to-speech conversion device 20, reads out text data, and so on. Reference numeral 13 denotes a data input/output unit that exchanges data and the like with the text-to-speech conversion device 20.

[0024] In the text-to-speech conversion device 20, reference numeral 21 denotes a data input/output unit that exchanges data and the like with the text data generator 10. Reference numeral 22 denotes a control unit that performs interface control with the text data generator 10, text analysis processing, synthesis parameter generation processing, and so on.

[0025] Reference numeral 23 denotes a fixed dictionary storage unit in which the fixed dictionary is stored. The fixed dictionary storage unit 23 is, for example, a read-only memory. Reference numeral 24 denotes a user dictionary storage unit in which the user dictionary is stored. The user dictionary storage unit 24 is, for example, a random access memory.

[0026] Reference numeral 25 denotes a category selection unit that selects the categories used for text analysis processing. Reference numeral 26 denotes a priority register that holds information indicating the priority order of the categories when the fixed dictionary and the user dictionary are referenced. Reference numeral 27 denotes an internal buffer for temporarily holding the results of synthesis parameter generation processing.

[0027] Reference numeral 28 denotes a speech synthesis unit that generates speech waveform data on the basis of the synthesis parameters held in the internal buffer 27. Reference numeral 29 denotes a speech output unit that, for example, converts the speech waveform data generated by the speech synthesis unit 28 from digital to analog and either outputs speech based on the converted output or transmits the converted output to another device.

[0028] The fixed dictionary storage unit 23 has a plurality of word storage areas 23A, 23B, 23C, ..., each corresponding to one of a plurality of categories A, B, C, ... for classifying words. Each word storage area 23A, 23B, 23C, ... stores the words belonging to its corresponding category A, B, C, ....

[0029] The user dictionary storage unit 24 likewise has a plurality of word storage areas 24A, 24B, 24C, ..., each corresponding to one of the categories A, B, C, .... Each word storage area 24A, 24B, 24C, ... stores the words belonging to its corresponding category. The categories corresponding to the word storage areas 24A, 24B, 24C, ... are the same as those corresponding to the word storage areas 23A, 23B, 23C, ... of the fixed dictionary storage unit 23, respectively.

[0030] Examples of the categories A, B, C, ... include "place names", "personal names", and "company names". Categories are not limited to those containing only nouns; some, such as "medicine", "physics", and "law", also contain verbs and other parts of speech.

[0031] The category selection unit 25 selects, from among the categories A, B, C, ... described above, the categories to be used for text analysis processing. This selection is made on the basis of a category designation command sent from the text data generator 10.

[0032] For each piece of text data, the control unit 12 of the text data generator 10 transfers, together with the text data, a category designation command that designates the categories to be used in analyzing that text data.

[0033] In this case, the category designation command must be distinguishable from the text data. Possible approaches include, for example, transferring position information for the category designation command, or giving the command a code format that can be distinguished from text data. This embodiment adopts the latter approach, as described in detail later.

[0034] The control unit 22 of the text-to-speech conversion device 20 extracts the category designation command from the received data by exploiting the difference in code format. The extracted category designation command is supplied to the category selection unit 25.

[0035] As shown in FIG. 2, the priority register 26 has a head address holding area 261, which holds the head address of each of the categories A, B, ..., and a selection data holding area 262, which holds data indicating whether each category is selected.

[0036] The head addresses of the categories A, B, C, ... are held in priority order. The figure shows the case in which category D has the highest priority. The selection data is written by the control unit 22 on the basis of the category designation command. Here, for example, "1" indicates that a category is selected and "0" indicates that it is not. The figure shows, as a representative case, categories D and C selected and categories A and B not selected.
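The priority register can be pictured as a small table held in priority order. The following is a minimal Python sketch of that idea, not the patented circuit: the class names, the address values, and the relative order of categories A and B are assumptions made for illustration (the text only states that D has the highest priority and that D and C are selected).

```python
from dataclasses import dataclass


@dataclass
class RegisterEntry:
    category: str           # category label, e.g. "D"
    head_address: int       # head address of the category's word storage area (hypothetical value)
    selected: bool = False  # selection data: True stands for "1", False for "0"


class PriorityRegister:
    """Entries are kept in priority order, highest priority first."""

    def __init__(self, entries_in_priority_order):
        self.entries = [RegisterEntry(c, a) for c, a in entries_in_priority_order]

    def write_selection(self, selected_categories):
        # Corresponds to the control unit writing "1"/"0" into the selection data area.
        for entry in self.entries:
            entry.selected = entry.category in selected_categories

    def selected_in_priority_order(self):
        return [entry for entry in self.entries if entry.selected]


# Assumed layout: D first (stated in the text), then C, A, B (assumed order).
register = PriorityRegister([("D", 0x3000), ("C", 0x2000), ("A", 0x0000), ("B", 0x1000)])
register.write_selection({"C", "D"})
search_order = [entry.category for entry in register.selected_in_priority_order()]
```

Because the head address travels with each entry, code that walks `selected_in_priority_order()` never needs a separate table of storage locations.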

[0037] (1-3) Operation of the First Embodiment

The operation of the above configuration is described next. The following description takes as a representative case text data that is Japanese text, that is, sentences written in a mixture of kanji and kana.

[0038] First, the text-to-speech conversion processing that converts text data into speech waveform data is described.

[0039] When the user inputs to the text data generator 10 a conversion instruction that directs execution of text-to-speech conversion processing, the control unit 12 reads the Japanese text data specified by the instruction from the file storage unit 11. This text data is supplied to the text-to-speech conversion device 20 via the data input/output unit 13.

[0040] The text data supplied to the text-to-speech conversion device 20 is temporarily held in a text buffer provided inside the data input/output unit 21 and is then converted into speech waveform data by text-to-speech conversion processing.

[0041] FIG. 3 shows an example of text-to-speech conversion processing. The processing shown is broadly divided into text analysis processing, synthesis parameter generation processing, and speech synthesis processing.

[0042] Text analysis processing uses word concatenation information, the fixed dictionary stored in the fixed dictionary storage unit 23, and the user dictionary stored in the user dictionary storage unit 24 to identify words and clauses, assign readings, determine clause accents, and handle rendaku (sequential voicing), onbin (euphonic sound changes), and the like, thereby generating a phoneme-prosody symbol string (an intermediate language).

[0043] Synthesis parameter generation processing generates the parameters needed for speech synthesis from the phoneme-prosody symbol string obtained by text analysis processing, by applying prosody rules. Applying prosody rules here means determining the synthesis units, inserting pauses, and controlling phoneme duration, amplitude, pitch, and so on.

[0044] Speech synthesis processing generates speech waveform data by waveform synthesis while referring to a waveform dictionary.

[0045] Typical text analysis processing items include word segmentation, determination of word readings, determination of accents, determination of part-of-speech information, word combination, compound-sound processing, accent combination, setting of clause readings, clause combination, determination of accent strength, determination of intonation position and strength, and determination of pause length.

[0046] FIG. 4 shows typical text analysis processing. The processing shown consists of morphological analysis, clause generation, and phrase generation.

[0047] Morphological analysis matches the input text data against the fixed dictionary and the user dictionary and divides the text data into words. FIG. 5 shows the typical contents registered in the fixed dictionary and the user dictionary. As shown, the written form, reading, part of speech, and accent position of each word are registered in these dictionaries. Thus, when the input text data is matched against a dictionary and a written form matches, the corresponding reading, part of speech, and accent position are obtained.
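The dictionary matching described above can be illustrated with a short Python sketch. The text does not say which matching strategy is used, so greedy longest-match segmentation is assumed here purely for illustration; the `Entry` type, the sample words, and their accent values are likewise hypothetical.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    reading: str
    part_of_speech: str
    accent_position: int


# Hypothetical entries: written form -> (reading, part of speech, accent position).
DICTIONARY = {
    "東京": Entry("トウキョウ", "noun", 0),
    "に": Entry("ニ", "particle", 0),
    "行く": Entry("イク", "verb", 0),
}


def segment(text, dictionary):
    """Greedy longest-match segmentation (an assumed strategy, not taken from the patent).

    Returns a list of (written form, Entry or None); characters that match no
    dictionary entry are emitted one at a time with None.
    """
    result = []
    pos = 0
    max_len = max(map(len, dictionary), default=1)
    while pos < len(text):
        for length in range(min(max_len, len(text) - pos), 0, -1):
            surface = text[pos:pos + length]
            if surface in dictionary:
                result.append((surface, dictionary[surface]))
                pos += length
                break
        else:
            # No dictionary entry starts at this position.
            result.append((text[pos], None))
            pos += 1
    return result


words = segment("東京に行く", DICTIONARY)
```

Each matched word carries its reading, part of speech, and accent position forward to the later clause and phrase generation steps.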

[0048] Clause generation performs word combination, compound-sound processing, and accent combination on the basis of the word readings, parts of speech, and accent positions obtained by morphological analysis, and determines the reading of each clause. Accent combination rules and clause boundary rules based on part-of-speech information are used for this determination.

[0049] Phrase generation determines the reading of each phrase according to rules for combining clauses and determines the prosody within the phrase. FIG. 6 shows an example of combination rules based on part-of-speech information.

[0050] Text analysis processing and synthesis parameter generation processing are executed by the control unit 22. Speech synthesis processing, by contrast, is executed by the speech synthesis unit 28. The synthesis parameters generated by the control unit 22 are held in the internal buffer 27 and then supplied to the speech synthesis unit 28.

[0051] The speech waveform data generated by the speech synthesis unit 28 is supplied to the speech output unit 29. The speech output unit 29 converts the speech waveform data from digital to analog and either drives a speaker with the converted output to produce speech or transmits the converted output to another device over a communication line.

[0052] Next, the operation that characterizes the present invention is described. When transferring text data to the text-to-speech conversion device 20, the control unit 12 of the text data generator 10 first transfers the category designation command described above and then transfers the text data.

[0053] The control unit 22 of the text-to-speech conversion device 20 extracts the category designation command from the received data and supplies it to the category selection unit 25. On the basis of the extracted command, the control unit 22 also writes selection data into the selection data holding area 262 of the priority register 26. As a result, "1" is written at the positions corresponding to the categories designated by the command.

[0054] On receiving the category designation command, the category selection unit 25 selects the categories that the command designates. The word storage areas corresponding to the selected categories are thereby enabled. For example, when the category designation command designates categories C and D, the corresponding word storage areas 23C, 23D, 24C, and 24D are enabled.

[0055] When category selection is complete, the control unit 22 analyzes the received text data using the words stored in the word storage areas corresponding to the selected categories. In doing so, the control unit 22, for example, refers to the fixed dictionary first and refers to the user dictionary only if the target word is not found in the fixed dictionary.

[0056] When referring to each dictionary, the control unit 22 works category by category in the priority order of the selected categories. In the example of FIG. 2, the control unit 22 first refers to the word storage area 23D (or 24D) for category D and, if the target word is not stored there, refers to the word storage area 23C (or 24C) for category C.
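The lookup order can be sketched in Python as two nested loops. The text gives two rules (fixed dictionary before user dictionary, and selected categories in priority order) without spelling out how they nest, so this sketch assumes that every selected category of the fixed dictionary is tried before the user dictionary is consulted. The function name, the data layout, and the sample words are hypothetical.

```python
# Hypothetical contents: dictionary -> category -> written form -> reading.
FIXED = {
    "A": {"東京": "トウキョウ"},
    "C": {"沖電気": "オキデンキ"},
    "D": {"心電図": "シンデンズ"},
}
USER = {
    "D": {"心房細動": "シンボウサイドウ"},
}


def look_up(word, selected_by_priority, fixed=FIXED, user=USER):
    """Return (reading, source, category) for the first hit, or None.

    Assumed nesting: fixed dictionary first, then user dictionary; within each
    dictionary, only the selected categories are searched, highest priority first.
    """
    for source, dictionary in (("fixed", fixed), ("user", user)):
        for category in selected_by_priority:
            reading = dictionary.get(category, {}).get(word)
            if reading is not None:
                return reading, source, category
    return None


selected = ["D", "C"]  # priority order from the FIG. 2 example
hit = look_up("心房細動", selected)
```

A word stored only under an unselected category, such as the entry under "A" here, is never examined, which is where the saving in matching time comes from.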

[0057] The priority register 26 stores the head addresses of the categories A, B, C, D, .... The control unit 22 can therefore access the word storage area for a selected category without having to know its location in advance.

[0058] The category designation command is defined, for example, as a hexadecimal code representing 2 bytes, enclosed in braces. Each bit of this 2-byte code corresponds to one of the categories A, B, C, .... A 2-byte code can therefore handle up to 16 categories A, B, C, ....

[0059] If each category A, B, ... is selected when its corresponding bit is "1" and not selected when the bit is "0", the command code that selects all categories A, B, ... is written {FFFF}.
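The bit-per-category encoding can be tried out with a few lines of Python. This is a sketch under stated assumptions: the text does not say which bit belongs to which category, so the least significant bit is assigned to category A here, the labels A to P are invented to fill all 16 bits, and plain ASCII braces are used instead of 2-byte character codes.

```python
import string

CATEGORIES = "ABCDEFGHIJKLMNOP"  # 16 hypothetical labels, one per bit (bit 0 = "A" is an assumption)


def encode_command(selected):
    """Build a command such as "{000C}" from an iterable of category labels."""
    mask = 0
    for label in selected:
        mask |= 1 << CATEGORIES.index(label)
    return "{%04X}" % mask


def decode_command(command):
    """Return the list of category labels designated by a "{XXXX}" command."""
    digits = command[1:-1]
    if (len(command) != 6 or command[0] != "{" or command[-1] != "}"
            or any(ch not in string.hexdigits for ch in digits)):
        raise ValueError("not a category designation command: %r" % command)
    mask = int(digits, 16)
    return [label for i, label in enumerate(CATEGORIES) if (mask >> i) & 1]


command = encode_command("CD")
```

With this bit assignment, selecting every category yields `{FFFF}`, matching the all-categories code given in the text.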

[0060] When the category designation command is transferred from the text data generator 10 to the text-to-speech conversion device 20, it is temporarily held in the text data buffer of the data input/output unit 21.

[0061] The control unit 22 of the text-to-speech conversion device 20 reads data from the text data buffer 2 bytes at a time and determines whether it is "{". If it is "{", the data that follows can be interpreted in two ways: as a category designation command or as text data.

[0062] The control unit 22 therefore reads the next 8 bytes (each "F" being a 2-byte character code). If these are "FFFF", they are still valid as a command. In that case the control unit 22 reads the following 2 bytes; if they are "}", the data is judged to be the category designation command "{FFFF}", and if they are not "}", the data is judged to be text data. In this way, the control unit 22 extracts the category designation command from the received data.
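The extraction step can be sketched as a small scanner in Python. Several points are assumptions rather than statements from the text: each Python character stands for one 2-byte character code, full-width braces and digits are folded to ASCII with NFKC normalization, any four hexadecimal digits (not only "FFFF") are accepted, and the function name is hypothetical.

```python
import string
import unicodedata

HEX_DIGITS = set(string.hexdigits)


def _narrow(ch):
    # Fold a full-width character such as "｛" or "Ｆ" to its ASCII counterpart.
    return unicodedata.normalize("NFKC", ch)


def split_commands(received):
    """Separate category designation commands from text.

    Returns (list of 16-bit masks, remaining text). A run is treated as a
    command only if it is "{", four hexadecimal digits, then "}"; anything
    else, including a lone "{", is kept as text.
    """
    masks = []
    text = []
    i = 0
    while i < len(received):
        window = [_narrow(ch) for ch in received[i:i + 6]]
        if (len(window) == 6 and window[0] == "{" and window[5] == "}"
                and all(ch in HEX_DIGITS for ch in window[1:5])):
            masks.append(int("".join(window[1:5]), 16))
            i += 6
        else:
            text.append(received[i])
            i += 1
    return masks, "".join(text)


masks, body = split_commands("｛０００Ｃ｝心電図を確認する")
```

Braces that do not enclose four hexadecimal digits pass through untouched, so ordinary bracketed text is still read aloud as text.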

[0063] FIG. 7 is a block diagram showing an example of a specific configuration of the category selection unit 25 described above.

[0064] The category selection unit 25 shown consists of a latch circuit 251. The input terminals of the latch circuit 251 are supplied with the 2 bytes of data that the control unit 22 reads out in parallel from the text data buffer of the data input/output unit 21.

[0065] As shown in FIG. 8, each output bit of the latch circuit 251 is connected to the chip select terminals CE of the corresponding word storage areas 23A, 24A, 23B, 24B, 23C, 24C, .... For example, the first-bit output terminal of the latch circuit 251 is connected to the chip select terminals CE of the word storage areas 23A and 24A, the second-bit output terminal to those of the word storage areas 23B and 24B, and the third-bit output terminal to those of the word storage areas 23C and 24C.

[0066] In this configuration, the latch circuit 251 is reset by a reset signal output from the control unit 22 before text-to-speech conversion processing starts. The output of the latch circuit 251 is thereby set to, for example, "FFFF". In this case, all of the word storage areas 23A, 24A, 23B, 24B, 23C, 24C, ... are enabled.

[0067] In this state, when the control unit 22 extracts a category designation command, it outputs a latch signal. The category designation command is thereby latched in the latch circuit 251. As a result, the word storage areas corresponding to the categories designated by the command are now the ones enabled.
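The reset-then-latch behaviour can be modelled in software for illustration. This Python sketch is not a description of the actual hardware: the class and method names are hypothetical, the "first bit" is assumed to be the least significant bit, and only four categories are listed.

```python
class CategoryLatch:
    """Software stand-in for a 16-bit latch whose output bits drive chip-enable lines."""

    def __init__(self):
        self.output = 0
        self.reset()  # start in the reset state (an assumption about power-on behaviour)

    def reset(self):
        # Reset sets every output bit, so all word storage areas are enabled.
        self.output = 0xFFFF

    def latch(self, mask):
        # Latching a category designation command replaces the output bits.
        self.output = mask & 0xFFFF

    def chip_enable(self, bit_index):
        return bool((self.output >> bit_index) & 1)

    def enabled_areas(self, labels="ABCD"):
        # Bit i drives the fixed (23x) and user (24x) storage areas of the same category.
        areas = []
        for i, label in enumerate(labels):
            if self.chip_enable(i):
                areas += ["23" + label, "24" + label]
        return areas


latch = CategoryLatch()
after_reset = latch.enabled_areas()
latch.latch(0b1100)  # categories C and D under the assumed bit assignment
after_command = latch.enabled_areas()
```

Driving both dictionaries from the same bit keeps the fixed and user areas of a category enabled or disabled together, as in the C and D example given earlier in the text.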

[0068] (1-4) Effects of the First Embodiment

The embodiment described in detail above provides the following effects.

[0069] First, in this embodiment, words are classified by category when they are registered in the word dictionary; when text data is analyzed, the categories to be used are selected from among the plurality of categories, and the analysis is based on the words belonging to the selected categories. Text data can therefore be analyzed using only the parts of the fixed dictionary and the user dictionary that are needed for the analysis.

[0070] Consequently, even if the dictionary is compiled broadly and in depth, a drop in the speed of text analysis can be prevented as far as possible. This embodiment thus satisfies two conflicting requirements: expanding the word dictionary and preventing the text analysis from slowing down.

[0071] Further, because the category is designated on the text data generation device 10 side, every text-to-speech conversion device 20 can select the optimum category in a sender-driven system, that is, a system in which a plurality of text-to-speech conversion devices 20 are connected to one text data generation device 10 via telephone lines or radio waves and the same text data is broadcast to all of them.

[0072] In such a system, the category could instead be designated on the receiver side. In that configuration, however, the receivers do not necessarily know the content of the text data, so they cannot always designate the category best suited to the text data they receive.

[0073] By contrast, when the sender designates the category as in this embodiment, the category is designated by someone who knows the content of the text data well, so every receiver can select the category best suited to the received text data.

[0074] Furthermore, because the words belonging to the selected categories are referred to in a predetermined priority order when text data is analyzed, the speed of text analysis can be increased.
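As a rough illustration of priority-ordered lookup, the sketch below searches only the selected categories, in priority order, and stops at the first hit. The function name, the dictionary layout, and the sample entries are all hypothetical; the text does not specify how the dictionary or the priority register is represented.

```python
def find_word(word, dictionaries, selected, priority):
    """Search the selected categories in priority order; return the first hit."""
    for category in priority:
        if category not in selected:
            continue  # unselected categories are never consulted
        entry = dictionaries[category].get(word)
        if entry is not None:
            return category, entry
    return None


# Hypothetical per-category dictionaries (category -> word -> entry).
dictionaries = {
    "A": {"antimony": "chemical term"},
    "B": {"andante": "musical term"},
    "C": {"ampoule": "medical term"},
}
hit = find_word("andante", dictionaries, selected={"B", "C"}, priority=["C", "B", "A"])
miss = find_word("antimony", dictionaries, selected={"B", "C"}, priority=["C", "B", "A"])
```

If the categories most likely to contain the word come first in `priority`, a hit is usually found after consulting fewer categories.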

[0075] (2) Second Embodiment
Next, a second embodiment of the present invention will be described in detail.

[0076] (2-1) Outline of the Second Embodiment
In the preceding embodiment, the text data generation device transfers a category designation command to the text-to-speech conversion device together with the text data, and the text-to-speech conversion device extracts the category designation command from the received data and selects the category to be used based on the extracted command.

[0077] In this embodiment, by contrast, the text-to-speech conversion device is provided with a category designation switch for designating the category to be used, and the category designated by this switch is selected.

[0078] Also, in the preceding embodiment, text data is analyzed by referring to the selected categories in a predetermined priority order.

[0079] In this embodiment, by contrast, a search table for finding a target word is created, and text data is analyzed by referring to the selected categories through this search table.

[0080] (2-2) Configuration of the Second Embodiment
FIG. 9 is a block diagram showing the configuration of the second embodiment of the present invention. FIG. 9 shows the overall configuration of a text-to-speech conversion system that includes the text analysis device of the present invention.

[0081] In the figure, 40 is a text data generation device and 50 is a text-to-speech conversion device. In the text data generation device 40, 41 is a file storage unit, 42 is a control unit, and 43 is a data input/output unit.

[0082] In the text-to-speech conversion device 50, 51 is a data input/output unit, 52 is a control unit, 53 is a fixed dictionary storage unit, 54 is a user dictionary storage unit, 55 is a category selection unit, 56 is a search table storage unit, 57 is an internal buffer, 58 is a speech synthesis unit, 59 is an audio output unit, and 60 is a category designation switch.

[0083] The fixed dictionary storage unit 53 has a plurality of word storage areas 53A, 53B, 53C, ... corresponding to the categories A, B, C, ..., respectively. The user dictionary storage unit 54 likewise has a plurality of word storage areas 54A, 54B, 54C, ... corresponding to the categories A, B, C, ..., respectively.

[0084] The main features of the configuration of this embodiment are, for example, as follows.

[0085] In this embodiment, the category to be used is designated by the user with the category designation switch 60. The control unit 42 of the text data generation device 40 therefore does not transfer a category designation command.

[0086] Also, in this embodiment, text data is analyzed by searching for the target word with the search table stored in the search table storage unit 56. When the category to be used is designated with the category designation switch 60, the control unit 52 creates this search table from the designated category and the stored contents of the fixed dictionary and the user dictionary.

[0087] (2-3) Operation of the Second Embodiment
The operation of the above configuration will now be described. The description focuses on the operations characteristic of this embodiment: the category selection operation, the search table creation operation, and text analysis using the search table.

[0088] First, the category selection operation will be described. When the user designates the category to be used with the category designation switch 60, category designation data is supplied from the category designation switch 60 to the category selection unit 55. On receiving this data, the category selection unit 55 selects the category it designates.

[0089] FIG. 10 shows an example of a specific configuration of the category designation switch 60. The illustrated category designation switch 60 is, for example, a 16-bit DIP switch, and can thus handle 16 categories A, B, ..., P. The outputs of the individual bit switches 60A, 60B, ..., 60P of the category designation switch 60 are supplied to the category selection unit 55.
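The mapping from the 16 switch bits to the categories A to P can be sketched as follows. The bit order (least significant bit for switch 60A) and the convention that a set bit means "selected" are assumptions made for this example; the text states only that there is one switch per category.

```python
import string

CATEGORIES = string.ascii_uppercase[:16]  # "ABCDEFGHIJKLMNOP"


def selected_categories(switch_bits):
    """Return the categories whose switch bit is set (bit 0 = A, assumed)."""
    return [c for i, c in enumerate(CATEGORIES) if (switch_bits >> i) & 1]


some = selected_categories(0b1001)       # switches for A and D are on
everything = selected_categories(0xFFFF)  # all 16 switches are on
```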

[0090] The category selection unit 55 consists of, for example, a latch circuit 551, as in the preceding embodiment. At the start of text analysis, the latch circuit 551 is first reset by a reset signal from the control unit 52 and then latches its input data in accordance with a latch signal, also from the control unit 52. The category designation data output from the category designation switch 60 is thereby latched in the latch circuit 551.

[0091] Next, the search table creation operation will be described. Each time a category to be used is designated with the category designation switch 60, the control unit 52 generates a search table for finding the words belonging to the designated category.

[0092] The search table consists of two tables. One holds the storage addresses of words. The other holds, for each character, the head address of the link in which the storage addresses of the words beginning with that character are stored. Hereinafter, the former is called the secondary search table and the latter the primary search table.

[0093] FIG. 11 shows an example of the registered contents of the fixed dictionary stored in the fixed dictionary storage unit 53 and the user dictionary stored in the user dictionary storage unit 54. To simplify the description, the figure does not distinguish between the fixed dictionary and the user dictionary and shows the registered contents as a single word dictionary. FIG. 12 shows an example of the registered contents of the secondary search table. FIG. 13 shows an example of the registered contents of the primary search table.

[0094] In the dictionary shown in FIG. 11, each category is allotted 10000h addresses (h denotes hexadecimal). Each address holds the notation and reading used for dictionary matching, together with other information. In the illustrated example, category A represents chemical terms, category B musical terms, category C medical terms, and category D biological terms.

[0095] To create the search table, the control unit 52 first extracts, in order from the beginning of the dictionary shown in FIG. 11, the storage addresses of the words that begin with, for example, the character code for "ア" (a). Only words belonging to the designated categories are extracted.

[0096] Suppose that categories A, B, C, and D are all designated. Then the following are extracted: the storage address 10000h of "アルカロイド" (alkaloid), the storage address 10001h of "アンチモン" (antimony), the storage address 20000h of "アンダンテ" (andante), the storage address 30000h of "アンプール" (ampoule), and the storage address 40001h of "アンコウ" (anglerfish).

[0097] Next, all the extracted storage addresses 10000h, 10001h, 20000h, 30000h, and 40001h are sorted in descending order of the notation length of the corresponding words. In this example, the notations are, from longest to shortest, "アルカロイド", "アンダンテ", "アンプール", "アンチモン", and "アンコウ". The extracted storage addresses are therefore rearranged into the order 10000h, 20000h, 30000h, 10001h, 40001h.

[0098] Note that "アンダンテ" and "アンプール" have the same notation length, 6. In such a case, for example, the word in the earlier category, here "アンダンテ", is placed first.
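The sorting rule (longest notation first, earlier category first on a tie) can be sketched with a two-part sort key. The storage addresses are those of the worked example. The notation lengths are placeholders chosen only to reproduce the stated order: the text gives the value 6 for "アンダンテ" and "アンプール" and otherwise only the relative order, so the values 7, 5, and 4 are assumptions.

```python
# (storage address, notation length); lengths other than 6 are placeholders.
extracted = [
    (0x10000, 7),  # alkaloid, category A
    (0x10001, 5),  # antimony, category A
    (0x20000, 6),  # andante, category B
    (0x30000, 6),  # ampoule, category C
    (0x40001, 4),  # anglerfish, category D
]


def sort_for_link(entries):
    # Longest notation first. For equal lengths the lower address wins,
    # which puts the earlier category first because categories occupy
    # ascending address blocks.
    return sorted(entries, key=lambda e: (-e[1], e[0]))


link = [hex(address) for address, _ in sort_for_link(extracted)]
```

With these inputs `link` comes out in the order 10000h, 20000h, 30000h, 10001h, 40001h, matching the order given in the text.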

[0099] When the sorting is finished, the storage addresses 10000h, 20000h, 30000h, 10001h, and 40001h are written in order starting at address 1000h. When the writing is finished, a code marking the end of the link of storage addresses for words beginning with "ア", for example "FFFFFFFFh", is stored at the next address 1fffh.

[0100] The same processing is then performed for "イ" (i). In this case, writing of the storage addresses starts at address 2000h, the address following 1fffh. The same processing is then performed for "ウ" (u), "エ" (e), and so on. When it has been completed for every character subject to text analysis, creation of the secondary search table is complete.

[0101] When the secondary search table has been created, the head address of the link for "ア" and the character code of "ア" are stored at address 0100h. The same processing is then performed for "イ", "ウ", "エ", and so on. When it has been completed for every character subject to text analysis, creation of the primary search table is complete.
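The construction of the two tables can be sketched with plain dictionaries standing in for memory. Several details here are simplifying assumptions rather than facts from the text: each leading character is given its own 1000h-wide region, the end code is placed directly after the last entry of a link, primary entries are placed at consecutive addresses from 0100h, and the entry for "イ" is invented for illustration.

```python
END_OF_LINK = 0xFFFFFFFF   # end code from the text
LINK_BASE = 0x1000         # the first link starts here
LINK_STRIDE = 0x1000       # assumed region size per leading character
PRIMARY_BASE = 0x0100      # address of the first primary entry


def build_tables(sorted_links):
    """sorted_links maps a leading character to its sorted storage addresses."""
    primary, secondary = {}, {}
    for n, (char, addresses) in enumerate(sorted_links.items()):
        head = LINK_BASE + n * LINK_STRIDE
        for offset, address in enumerate(addresses):
            secondary[head + offset] = address
        # Simplification: the end code directly follows the last entry.
        secondary[head + len(addresses)] = END_OF_LINK
        # Each primary entry pairs the character with the head of its link.
        primary[PRIMARY_BASE + n] = (char, head)
    return primary, secondary


primary, secondary = build_tables({
    "ア": [0x10000, 0x20000, 0x30000, 0x10001, 0x40001],
    "イ": [0x10002],  # hypothetical entry, not from the text
})
```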

[0102] Next, text analysis using the search table will be described.

[0103] Suppose that the received text data is "アンコウは…" ("The anglerfish ..."). In this case, the control unit 52 focuses on the first character, "ア", and looks it up in the primary search table to obtain the address at which the link for "ア" is located.

[0104] In the case of FIG. 13, "ア" is stored at address 0100h and the address data that follows it is 1000h, which shows that the link for "ア" is at address 1000h.

[0105] Next, the control unit 52 reads the contents of address 1000h, compares the contents of the address they indicate with the text data, and determines whether the two match. In this example, address 1000h holds 10000h, so the contents of address 10000h are read and compared with the text data. Because address 10000h holds "アルカロイド", it is determined that the two do not match.

[0106] Next, the control unit 52 reads the contents of address 1001h, compares the contents of the address they indicate with the text data, and determines whether the two match. In this example, address 1001h holds 20000h and address 20000h holds "アンダンテ", so it is determined that the two do not match.

[0107] The same processing is then repeated while the address in the secondary search table is advanced in turn. Eventually "アンコウ", stored at address 40001h, is reached.

[0108] In this configuration, if categories A, B, and C shown in FIG. 11 are not designated, the search table shown in FIG. 12 becomes smaller. As a result, fewer steps are needed to reach "アンコウ", and a faster search becomes possible.
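The walk along a link, and the effect of designating fewer categories, can be sketched as follows. The dictionary contents and addresses come from the worked example. The rest is assumed for illustration: the primary table is keyed directly by character, a "match" is taken to mean that the text begins with the word's notation, and the end code is placed right after the last entry of a link.

```python
END_OF_LINK = 0xFFFFFFFF

# Storage address -> notation, as in the worked example.
DICTIONARY = {
    0x10000: "アルカロイド",
    0x10001: "アンチモン",
    0x20000: "アンダンテ",
    0x30000: "アンプール",
    0x40001: "アンコウ",
}


def lookup(text, primary, secondary):
    """Return (storage address, number of comparisons) for the leading word."""
    head = primary.get(text[0])
    if head is None:
        return None, 0
    steps = 0
    pointer = head
    while secondary[pointer] != END_OF_LINK:
        steps += 1
        address = secondary[pointer]
        if text.startswith(DICTIONARY[address]):  # assumed matching rule
            return address, steps
        pointer += 1
    return None, steps


primary = {"ア": 0x1000}
all_categories = {0x1000: 0x10000, 0x1001: 0x20000, 0x1002: 0x30000,
                  0x1003: 0x10001, 0x1004: 0x40001, 0x1005: END_OF_LINK}
only_d = {0x1000: 0x40001, 0x1001: END_OF_LINK}

full = lookup("アンコウは", primary, all_categories)
small = lookup("アンコウは", primary, only_d)
```

With all four categories designated, the word is reached on the fifth comparison; with only category D designated, it is reached on the first.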

[0109] (2-4) Effects of the Second Embodiment
The embodiment described in detail above provides substantially the same effects as the first embodiment and, in addition, the following effects.

[0110] First, because the category is designated on the text-to-speech conversion device 50 side, a good category can be selected for each item of text data, without increasing communication costs, in a receiver-driven system, that is, a system in which there are a plurality of text data generation devices 40 and the text-to-speech conversion device 50 selects the desired text data from among the text data sent by those devices.

[0111] In this case too, the category could be designated on the sender side, as in the first embodiment. In such a system, however, the moment at which the sender sends a category designation command and the moment at which the receiver selects text data, in other words selects a transmitting station, do not coincide.

[0112] To reduce this gap, many category designation commands would have to be embedded in the transmitted text data. Doing so, however, enlarges the transmitted data and raises communication costs.

[0113] In this embodiment, by contrast, the category is designated on the receiver side, so no such timing adjustment is needed. A good category can thus be selected for each item of text data without the rise in communication costs that larger transmitted data would cause.

[0114] Furthermore, because a search table for finding the target word is created each time the designated category changes and text analysis is performed using this search table, the speed of analysis can be increased.

[0115] (3) Other Embodiments
Two embodiments of the present invention have been described in detail above, but the present invention is not limited to these embodiments.

[0116] For example, the first embodiment described a case in which the priority order of the categories is fixed. However, the present invention may, for example, monitor how often each category is searched and, once several text files have been analyzed, automatically change the priority order based on the monitored search frequencies.
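One possible reading of this variation is sketched below: hits per category are counted while files are analyzed, and the priority list is then re-sorted by descending count. The function name and the sample counts are hypothetical, and keeping the previous order for categories with equal counts is a design choice of this sketch, not something the text prescribes.

```python
from collections import Counter


def update_priority(priority, hit_counts):
    """Reorder categories by descending hit count.

    sorted() is stable, so categories with equal counts keep their
    previous relative order.
    """
    return sorted(priority, key=lambda category: -hit_counts.get(category, 0))


# Hypothetical counts gathered while analyzing several text files.
hits = Counter({"A": 3, "B": 12, "C": 3})
new_priority = update_priority(["A", "B", "C", "D"], hits)
```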

[0117] The second embodiment described a case in which the extracted storage addresses are sorted and registered in descending order of the notation length of the corresponding words. However, the present invention may, for example, assign a priority to each category, sort entries of different categories by that priority, and sort entries of the same category by, for example, notation length as in that embodiment.
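This alternative ordering amounts to a two-level sort key: category priority first, then descending notation length within a category. The sketch below uses hypothetical entry records and placeholder lengths; only the addresses are taken from the worked example of the second embodiment.

```python
def sort_with_priority(entries, priority):
    """Sort by category priority, then by descending notation length."""
    rank = {category: i for i, category in enumerate(priority)}
    return sorted(entries, key=lambda e: (rank[e["category"]], -e["length"]))


# Hypothetical records; the lengths are placeholders for illustration.
entries = [
    {"address": 0x10000, "category": "A", "length": 7},
    {"address": 0x10001, "category": "A", "length": 5},
    {"address": 0x20000, "category": "B", "length": 6},
    {"address": 0x40001, "category": "D", "length": 4},
]
order = [hex(e["address"]) for e in sort_with_priority(entries, ["D", "A", "B"])]
```

With category D given the highest priority, its entry moves to the head of the link even though its notation is the shortest.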

[0118] The preceding embodiments described a case in which the present invention is applied to the text analysis device of a text-to-speech conversion system. However, the present invention can also be applied to text analysis devices in systems other than text-to-speech conversion systems.

[0119] The preceding embodiments also described a case in which the present invention is applied to a text analysis device that generates a phoneme-prosody symbol string from text data using a word dictionary. However, the present invention can be applied to any text analysis device that at least performs word segmentation, even one that does not generate a phoneme-prosody symbol string.

[0120] Needless to say, the present invention can be modified in various other ways without departing from its gist.

【０１２１】[0121]

[Effects of the Invention] As described in detail above, according to the present invention, words are classified into a plurality of categories when they are registered in the dictionary, and, when text data is analyzed, the categories to be used are selected and the analysis is performed using the words belonging to the selected categories. Text data can therefore be analyzed using only the part of the word dictionary that is actually needed for the text data being analyzed. This satisfies the two conflicting requirements of expanding the word dictionary and preventing the text analysis from slowing down.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of the first embodiment of the present invention.

FIG. 2 is a diagram showing the configuration of the priority register of the first embodiment.

FIG. 3 is a diagram showing an example of the text-to-speech conversion process of the first embodiment.

FIG. 4 is a diagram showing an example of the text analysis process of the first embodiment.

FIG. 5 is a diagram showing an example of the registered contents of the word dictionary of the first embodiment.

FIG. 6 is a diagram showing an example of the part-of-speech connection rules of the first embodiment.

FIG. 7 is a diagram showing an example of a specific configuration of the category selection unit of the first embodiment.

FIG. 8 is a block diagram showing the relationship between the category selection unit and the dictionary storage units of the first embodiment.

FIG. 9 is a block diagram showing the configuration of the second embodiment of the present invention.

FIG. 10 is a diagram showing an example of a specific configuration of the category designation switch of the second embodiment.

FIG. 11 is a diagram for explaining the search table creation process of the second embodiment.

FIG. 12 is a diagram for explaining the search table creation process of the second embodiment.

FIG. 13 is a diagram for explaining the search table creation process of the second embodiment.

[Explanation of symbols]

10, 40 ... Text data generation device
20, 50 ... Text-to-speech conversion device
11, 41 ... File storage unit
12, 22, 42, 52 ... Control unit
13, 21, 43, 51 ... Data input/output unit
23, 53 ... Fixed dictionary storage unit
24, 54 ... User dictionary storage unit
25, 55 ... Category selection unit
26 ... Priority register
27, 57 ... Internal buffer
28, 58 ... Speech synthesis unit
29, 59 ... Audio output unit
56 ... Search table storage unit
23A, 24A, 23B, 24B, 23C, 24C, ..., 53A, 54A, 53B, 54B, 53C, 54C, ... Word storage areas
60 ... Category designation switch

Claims

[Claims]

1. A text analysis device for analyzing text data using a word dictionary, comprising: word dictionary storage means having a plurality of word storage areas corresponding to a plurality of categories for word classification, each word storage area storing the words belonging to the corresponding category; category selecting means for selecting, from the plurality of categories, a category to be used in the analysis of the text data; and text analysis means for analyzing the text data using the words stored in the word storage area corresponding to the category selected by the category selecting means.

2. The text analysis device according to claim 1, wherein the category selecting means comprises: data output means for outputting category designation information, which designates a category to be used, together with text data; category designation information extracting means for extracting the category designation information from the output of the data output means; and selecting means for selecting the category to be used based on the category designation information extracted by the category designation information extracting means.

3. The text analysis device according to claim 1, wherein the category selecting means comprises: category designating means with which a user designates a category to be used; and selecting means for selecting the category designated by the category designating means.

4. The text analysis device according to claim 1, wherein the text analysis means analyzes the text data while referring, in accordance with a predetermined priority order, to the words belonging to the categories selected by the category selecting means.

5. The text analysis device according to claim 1, wherein the text analysis means comprises: search table creating means for creating a search table for searching for a target word, based on the category selected by the category selecting means and the stored contents of the word dictionary storage means; and analyzing means for analyzing the text data by referring to the word dictionary using the search table created by the search table creating means.