JP2000207395A

JP2000207395A - Device and method for analyzing Japanese language and storage medium recording Japanese language analyzing program

Info

Publication number: JP2000207395A
Application number: JP11010150A
Authority: JP
Inventors: Keizo Sato; 圭三佐藤
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1999-01-19
Filing date: 1999-01-19
Publication date: 2000-07-28

Abstract

PROBLEM TO BE SOLVED: To provide a device and a method for analyzing Japanese language that reduce the dictionary storage area and improve processing efficiency, and a storage medium on which a Japanese language analyzing program is recorded. SOLUTION: In a Japanese language analyzing device having two dictionaries, one for kana-kanji conversion and one for Japanese language analysis, the kana-kanji conversion index storing section 3 of a kana-kanji converting dictionary section 2 and the Japanese language analyzing dictionary index storing section 7 of a Japanese language analyzing dictionary section 6 are integrated by an index information merging section 10. Because the index information of the two dictionaries is shared, memory usage is significantly reduced.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a Japanese language analysis device and a Japanese language analysis method that are incorporated into, for example, a machine translation device that translates Japanese into another language or a Japanese proofreading support device that points out and corrects input errors in entered Japanese text, and to a recording medium on which a Japanese language analysis program is recorded.

[0002]

[Prior Art] In recent years, with the spread of personal computers and the development of information and communication networks, Japanese language processing devices, typified by machine translation devices that translate Japanese into other languages and Japanese proofreading support devices that point out and correct Japanese input errors, have spread rapidly. A Japanese language processing device must perform so-called Japanese language analysis, which determines the syntactic and semantic structure of the Japanese sentence being processed, and every such device is implemented with a built-in Japanese language analysis device.

[0003] The Japanese language analysis device is realized by applying the following two processes in sequence to a Japanese sentence given as a character string: (1) morphological analysis, which divides the sentence into words while referring to a word dictionary; and (2) syntactic and semantic analysis, which performs dependency analysis and semantic analysis on the divided words and determines a unique syntactic and semantic structure.

[0004] In morphological analysis, every word that starts at any position in the input sentence is retrieved from the word dictionary and passed to syntactic and semantic analysis. From these word candidates, syntactic and semantic analysis outputs a syntactic and semantic structure that satisfies grammatical constraints, such as part of speech, and semantic constraints; that is, a structure like the one in FIG. 9, a tree centered on the predicate that shows what semantic role each of the other words plays with respect to the predicate.

[0005] In doing so, syntactic and semantic analysis determines the part of speech and the word sense (the meaning of the word) of each word. The final goal of Japanese language analysis is to output a syntactic and semantic structure like the one in FIG. 9, in which the dependency relations between words are expressed as parent-child relations in a tree and the semantic relations between words are shown on the links between nodes.

[0006] However, even sentences with exactly the same syntax can have different syntactic and semantic structures depending on their constituent words. The following sentences are examples.

[0007] (1) 彼は目が黒い。 (He has black eyes.)

[0008] (2) 彼は彼女が好きだ。 (He likes her.)

[0009] Although the two sentences above share exactly the same syntax, "X は Y が predicate" (X wa Y ga predicate), their syntactic and semantic structures are completely different, as shown in FIGS. 9 and 10. This is because the meanings of their constituent words differ. Conversely, unless the meaning of each word, that is, its word sense, is clearly determined, the syntactic and semantic structure that is the final goal of syntactic and semantic analysis cannot be output. Word sense determination is therefore a key problem in Japanese language analysis and strongly affects analysis accuracy.

[0010] Word sense determination in conventional Japanese language analysis is described here.

[0011] Conventionally, to determine the part of speech and word sense of a word, a Japanese language analysis device uses a technique called pruning to discard unlikely candidates and then selects the most likely of the remaining candidates as the final solution.

[0012] For example, in the input sentence 「彼の頭はすばらしい」 ("His head is wonderful"), the word 「頭」 ("head") can be interpreted as either a noun or a suffix. Syntactic and semantic analysis first discards the suffix interpretation on the basis of the grammatical constraint that a suffix attaches immediately after an independent word. Next, the sense of the noun is determined, but 「頭」 has various senses, such as "body part", "tip of an object", "brain", and "head of craftsmen". To settle on a single sense, current Japanese language analysis checks whether each sense satisfies the semantic constraints of the predicate 「すばらしい」 ("wonderful"). The dictionary information for a predicate normally describes semantic constraints on the cases that the predicate can take. In this example, the entry for 「すばらしい」 describes the constraint that "person" or "brain" can fill the ga case, and syntactic and semantic analysis selects, from the multiple senses of 「頭」, which is the ga case of the input sentence, the senses that satisfy this constraint. Here, senses such as "brain" and "head of craftsmen" are selected. Constraints of this kind are applied repeatedly until the sense of every word is uniquely determined.
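
The constraint-based selection described in this paragraph can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the `Sense` record, the semantic category labels, and the `GA_CASE_CONSTRAINTS` table are hypothetical names introduced here, and the assignment of each sense of 「頭」 to a category is an assumption chosen so that the outcome matches the example in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    gloss: str     # human-readable sense, taken from the example in the text
    category: str  # hypothetical semantic category used for constraint checks


# Senses of 「頭」 listed in the text; the category labels are assumptions.
HEAD_SENSES = [
    Sense("body part", "body_part"),
    Sense("tip of an object", "object_part"),
    Sense("brain", "brain"),
    Sense("head of craftsmen", "person"),
]

# Hypothetical predicate dictionary: categories allowed in the ga case.
GA_CASE_CONSTRAINTS = {"すばらしい": {"person", "brain"}}


def select_senses(predicate, senses):
    """Keep only the senses whose category satisfies the predicate's ga-case constraint."""
    allowed = GA_CASE_CONSTRAINTS.get(predicate)
    if allowed is None:
        # No constraint recorded for this predicate: nothing can be pruned.
        return list(senses)
    return [s for s in senses if s.category in allowed]


remaining = select_senses("すばらしい", HEAD_SENSES)
print([s.gloss for s in remaining])
```

Two candidates survive this single constraint, which is why the text says that constraints are applied repeatedly until one sense remains.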

[0013] As described above, Japanese language analysis must determine word senses, and, as the example shows, sense determination considers every part of speech and every sense obtained by searching the dictionary. However, the dictionary search looks only at the character strings contained in the input sentence, that is, at the surface form of each word, and retrieves every matching word registered in the dictionary, so the number of sense candidates can grow indiscriminately. Because sense determination must try as many combinations of these candidates as possible, its efficiency and accuracy suffer. One way to address this problem is to use the kana information entered through an input means attached to the information terminal, such as a keyboard or a voice input device, to discard unnecessary interpretations (senses) before Japanese language analysis begins, thereby reducing the number of sense candidates. In the earlier example, suppose that 「かれはあたまがすばらしい」 (kare wa atama ga subarashii) is entered from the keyboard and converted by a kana-kanji conversion device into the sentence to be analyzed, 「彼は頭がすばらしい」, and that the reading of 「頭」 is stored as 「あたま」 (atama). When the dictionary is searched for 「頭」 during Japanese language analysis, the search also refers to this reading, so entries such as 「頭」 (reading: かしら, kashira) and 「頭」 (reading: とう, tou) can be excluded from the search results. As a result, the candidates for sense determination are narrowed down, and both analysis accuracy and analysis efficiency improve. In short, combining a kana-kanji conversion device with a Japanese language analysis device improves analysis accuracy and efficiency.
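
The reading-based narrowing described in this paragraph can be sketched as below. The entry layout (`id`, `surface`, `reading`) and the `lookup` function are hypothetical and exist only for this illustration; the three readings of 「頭」 are the ones named in the text.

```python
# Hypothetical analysis-dictionary entries; only surface form and reading matter here.
ENTRIES = [
    {"id": 1, "surface": "頭", "reading": "あたま"},
    {"id": 2, "surface": "頭", "reading": "かしら"},
    {"id": 3, "surface": "頭", "reading": "とう"},
    {"id": 4, "surface": "彼", "reading": "かれ"},
]


def lookup(surface, reading=None):
    """Return entries matching the surface form, optionally narrowed by the kana reading."""
    hits = [e for e in ENTRIES if e["surface"] == surface]
    if reading is not None:
        # The reading remembered from kana input removes homographs with other readings.
        hits = [e for e in hits if e["reading"] == reading]
    return hits


print(len(lookup("頭")))                          # surface-only search
print([e["id"] for e in lookup("頭", "あたま")])  # search that also uses the reading
```

A surface-only search returns every homograph, whereas the search that carries the reading from kana input returns only the 「あたま」 entry, so fewer candidates reach sense determination.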

[0014]

[Problems to Be Solved by the Invention] The kana-kanji conversion devices installed in many information devices today also fall into the category of Japanese language processing devices, and they contain both a syntactic analysis unit and a dictionary for syntactic analysis. When a new Japanese language processing device, such as a machine translation device, is added to an information device, the two devices process different inputs (a kana character string for the kana-kanji conversion device, a mixed kana-kanji character string for the machine translation device), yet part of their lexical information about words is identical (for example, as vocabulary information). Nevertheless, separate dictionaries exist for kana-kanji conversion and for machine translation, so dictionary storage space is wasted.

[0015] This is because current dictionary search schemes keep index information, which is address information for the dictionary data, in memory and use the index to load dictionary data into memory as needed (each time a dictionary search is requested). When several dictionaries exist, a large storage area is needed for the indexes, and the data that each application program loads into memory may also contain duplicated information.
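
The index-based search scheme described in this paragraph might look like the following sketch, in which the index stays in memory and records are read from the dictionary store only on request. The record format, the `build_dictionary` and `fetch` helpers, and the use of an in-memory `io.BytesIO` object in place of a dictionary file are all assumptions made for illustration.

```python
import io


def build_dictionary(records):
    """Pack (key, value) records into one byte store and build an in-memory index.

    The index maps each key to (offset, length), i.e. address information
    for the dictionary data.
    """
    blob = bytearray()
    index = {}
    for key, value in records:
        data = value.encode("utf-8")
        index[key] = (len(blob), len(data))
        blob.extend(data)
    return io.BytesIO(bytes(blob)), index


def fetch(store, index, key):
    """Load one record from the store on demand, using the index to locate it."""
    entry = index.get(key)
    if entry is None:
        return None
    offset, length = entry
    store.seek(offset)
    return store.read(length).decode("utf-8")


# Hypothetical records for a kana-kanji conversion dictionary.
store, index = build_dictionary([("あたま", "頭;noun"), ("かれ", "彼;pronoun")])
print(fetch(store, index, "かれ"))
```

With two dictionaries organized this way, two such indexes live in memory at once, which is the overhead the paragraph points to.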

[0016] Furthermore, current kana-kanji conversion and Japanese language analysis each need to perform their own dictionary search and syntactic analysis, which lowers processing efficiency.

[0017] An object of the present invention is to provide a Japanese language analysis device, a Japanese language analysis method, and a recording medium on which a Japanese language analysis program is recorded that, when the kana information input to a kana-kanji conversion device is used for word sense determination in Japanese language analysis, reduce the amount of memory used for the indexes into the dictionary lexical data needed for kana-kanji conversion and Japanese language analysis, and thus save valuable resources in information devices.

[0018]

[Means for Solving the Problems] To solve these problems, the present invention comprises: an input unit for entering kana characters and for instructing the start of kana-kanji conversion and Japanese syntactic analysis; a kana-kanji conversion dictionary unit that stores the kanji corresponding to kana characters together with their parts of speech and syntactic information; a kana-kanji conversion dictionary index storage unit that stores the storage locations of the lexical information of the words stored in the kana-kanji conversion dictionary unit; a kana-kanji information acquisition unit that retrieves the lexical information of a requested word from the kana-kanji conversion dictionary unit via the kana-kanji conversion dictionary index storage unit; a kana-kanji conversion unit that converts the entered kana characters into kanji codes on the basis of the lexical information obtained by the kana-kanji information acquisition unit; a Japanese analysis dictionary unit that stores readings and pronunciations, parts of speech and grammatical information, and semantic information for words; a Japanese analysis dictionary index storage unit that stores the storage locations of the lexical information of the words stored in the Japanese analysis dictionary unit; a Japanese analysis information acquisition unit that retrieves the lexical information of a requested word from the Japanese analysis dictionary unit via the Japanese analysis dictionary index storage unit; a Japanese analysis unit that performs Japanese morphological analysis and syntactic and semantic analysis on the mixed kana-kanji sentence obtained from the kana-kanji conversion unit, on the basis of the lexical information obtained by the kana-kanji information acquisition unit; an index information integration unit that integrates the index information in the kana-kanji conversion dictionary index storage unit and the Japanese analysis dictionary index storage unit; an analysis information storage unit that stores the index of the kana-kanji conversion dictionary unit, the index of the Japanese analysis dictionary unit, and the analyses produced during kana-kanji conversion and Japanese language analysis; and a display unit that displays the result of the Japanese language analysis.

[0019] This makes it possible to realize a Japanese language analysis device that, when the kana information input to the kana-kanji conversion device is used for word sense determination in Japanese language analysis, reduces the amount of memory used for the indexes into the dictionary lexical data needed for kana-kanji conversion and Japanese language analysis, and thus saves valuable resources in information devices.

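[0020]

[Embodiments of the Invention] The invention according to claim 1 of the present invention comprises: an input unit for entering kana characters and for instructing the start of kana-kanji conversion and Japanese syntactic analysis; a kana-kanji conversion dictionary unit that stores the kanji corresponding to kana characters together with their parts of speech and syntactic information; a kana-kanji conversion dictionary index storage unit that stores the storage locations of the lexical information of the words stored in the kana-kanji conversion dictionary unit; a kana-kanji information acquisition unit that retrieves the lexical information of a requested word from the kana-kanji conversion dictionary unit via the kana-kanji conversion dictionary index storage unit; a kana-kanji conversion unit that converts the entered kana characters into kanji codes on the basis of the lexical information obtained by the kana-kanji information acquisition unit; a Japanese analysis dictionary unit that stores readings and pronunciations, parts of speech and grammatical information, and semantic information for words; a Japanese analysis dictionary index storage unit that stores the storage locations of the lexical information of the words stored in the Japanese analysis dictionary unit; a Japanese analysis information acquisition unit that retrieves the lexical information of a requested word from the Japanese analysis dictionary unit via the Japanese analysis dictionary index storage unit; a Japanese analysis unit that performs Japanese morphological analysis and syntactic and semantic analysis on the mixed kana-kanji sentence obtained from the kana-kanji conversion unit, on the basis of the lexical information obtained by the kana-kanji information acquisition unit; an index information integration unit that integrates the index information in the kana-kanji conversion dictionary index storage unit and the Japanese analysis dictionary index storage unit; an analysis information storage unit that stores the index of the kana-kanji conversion dictionary unit, the index of the Japanese analysis dictionary unit, and the analyses produced during kana-kanji conversion and Japanese language analysis; and a display unit that displays the result of the Japanese language analysis. When a Japanese language analysis device is newly added to an information device that already incorporates a kana-kanji conversion device, and the kana information input to the kana-kanji conversion device is used for word sense determination in Japanese language analysis, this configuration has the effect of reducing memory usage.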
Kana-kanji information acquisition unit that retrieves the vocabulary information of the requested word from the kana-kanji conversion dictionary unit, and converts the input kana character to kanji code based on the word vocabulary information obtained from the kana-kanji information acquisition unit. A kana-kanji conversion unit for conversion, a Japanese analysis dictionary unit that stores reading and pronunciation, part-of-speech, grammar information, and semantic information for words, and a storage location for vocabulary information of words stored in the Japanese analysis dictionary unit A Japanese analysis dictionary index storage unit, a Japanese analysis information acquisition unit that retrieves the vocabulary information of the requested words from the Japanese analysis dictionary unit via the Japanese analysis dictionary index storage unit, and a kana kanji information acquisition Based on the word vocabulary information obtained from the Kana-Kanji conversion part, the Japanese analysis part that performs Japanese morphological analysis, syntax and semantic analysis of the Kana-Kanji mixed sentence obtained from the Kana-Kanji conversion part, and the Kana-Kanji conversion dictionary Index storage unit, index information integration unit that integrates index information in the Japanese analysis dictionary index storage unit, index of the kana-kanji conversion dictionary unit, index of the Japanese analysis dictionary unit, kana-kanji conversion, and Japanese analysis And a display unit for displaying the results of the Japanese language analysis. When the device is incorporated and the kana information input to the kana-kanji conversion device is used for the semantic determination processing of the Japanese analysis, it has the effect of reducing the amount of memory used.

[0021] The invention according to claim 2 of the present invention is the invention according to claim 1 configured such that, when kana characters are entered through the input unit and the kana-kanji conversion unit passes the mixed kana-kanji character string obtained by converting the kana characters to the Japanese analysis unit, the lexical information that the kana-kanji information acquisition unit obtained from the kana-kanji dictionary unit is passed along with it; and, when the Japanese analysis unit searches the Japanese analysis dictionary unit, it is determined whether any information duplicates the lexical information of the words passed from the kana-kanji conversion unit, and, if so, the Japanese analysis information acquisition unit removes the duplicates from the information to be obtained from the Japanese analysis dictionary unit and obtains only the difference in lexical information. When a Japanese language analysis device is newly added to an information device that already incorporates a kana-kanji conversion device, and the kana information input to the kana-kanji conversion device is used for word sense determination in Japanese language analysis, this configuration suppresses the increase in memory usage when dictionary data is loaded into memory and shortens the time needed to obtain dictionary data.
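
The difference acquisition of claim 2 can be pictured with the sketch below. It assumes that lexical information is a flat mapping from field names to values; the field names and the `fetch_difference` helper are hypothetical and are not taken from the patent.

```python
def fetch_difference(passed_info, analysis_entry):
    """Return only the fields of the analysis-dictionary entry that were not
    already passed along from kana-kanji conversion with the same value."""
    return {
        field: value
        for field, value in analysis_entry.items()
        if field not in passed_info or passed_info[field] != value
    }


# Lexical information already obtained during kana-kanji conversion (hypothetical fields).
passed = {"surface": "頭", "reading": "あたま", "pos": "noun"}

# Full entry held in the Japanese analysis dictionary (hypothetical fields).
entry = {
    "surface": "頭",
    "reading": "あたま",
    "pos": "noun",
    "senses": ["body part", "brain"],
}

diff = fetch_difference(passed, entry)
print(diff)

# The complete entry is recovered by combining the passed information with the difference.
combined = {**passed, **diff}
```

Only the `senses` field has to be loaded in this example, which corresponds to the smaller memory footprint and shorter acquisition time claimed in the text.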

[0022] The invention according to claim 3 of the present invention is the invention according to claim 1 configured such that, when kana characters are entered through the input unit and the kana-kanji conversion unit passes the mixed kana-kanji character string obtained by converting the kana characters to the Japanese analysis unit, the results of the analysis that the kana-kanji conversion unit performed in converting the kana character string into kanji are passed along with it, and the Japanese analysis unit uses this analysis information to perform syntactic and semantic analysis of the mixed kana-kanji sentence output by the kana-kanji conversion unit. Because the Japanese language analysis device reuses the syntactic analysis results passed from kana-kanji conversion and does not repeat syntactic and semantic analysis unnecessarily, the processing time of the Japanese language analysis device can be shortened.

[0023] Embodiments of the present invention are described below with reference to the drawings.

[0024] (Embodiment 1) A first embodiment of the present invention is described below.

[0025] FIG. 1 is a functional block diagram of the Japanese language analysis device according to the first embodiment of the present invention and shows its configuration in terms of functional means.

[0026] In FIG. 1, reference numeral 1 denotes an input unit for entering kana characters and for instructing the start of kana-kanji conversion and Japanese syntactic analysis. Reference numeral 2 denotes a kana-kanji conversion dictionary unit that stores the kanji corresponding to kana characters together with their parts of speech and syntactic information. Reference numeral 3 denotes a kana-kanji conversion dictionary index storage unit that stores the storage locations of the lexical information of the words stored in the kana-kanji conversion dictionary unit 2. Reference numeral 4 denotes a kana-kanji information acquisition unit that retrieves a requested word from the kana-kanji conversion dictionary unit 2 via the kana-kanji conversion dictionary index storage unit 3. Reference numeral 5 denotes a kana-kanji conversion unit that converts the entered kana characters into kanji codes on the basis of the lexical information obtained by the kana-kanji information acquisition unit 4.

[0027] Reference numeral 6 denotes a Japanese analysis dictionary unit that stores readings and pronunciations, parts of speech and grammatical information, semantic information, and the like for words. Reference numeral 7 denotes a Japanese analysis dictionary index storage unit that stores the storage locations of the lexical information of the words stored in the Japanese analysis dictionary unit 6. Reference numeral 8 denotes a Japanese analysis information acquisition unit that retrieves a requested word from the Japanese analysis dictionary unit 6 via the Japanese analysis dictionary index storage unit 7. Reference numeral 9 denotes a Japanese analysis unit that performs Japanese morphological analysis and syntactic and semantic analysis on the mixed kana-kanji sentence obtained from the kana-kanji conversion unit 5, on the basis of the lexical information obtained by the Japanese analysis information acquisition unit 8.

[0028] Reference numeral 10 denotes an index information merging unit that merges the index information in the kana-kanji conversion dictionary index storage unit 3 and the Japanese analysis dictionary index storage unit 7. Reference numeral 11 denotes an analysis information storage unit that stores the index of the kana-kanji conversion dictionary unit 2, the index of the Japanese analysis unit 9, and the analysis information produced during kana-kanji conversion and Japanese language analysis. Reference numeral 12 denotes a display unit that displays the result of the Japanese language analysis, and reference numeral 13 denotes a control unit that controls the operation of each of the units described above.

[0029] FIG. 2 is a circuit block diagram of the registration device and the translation device according to the first embodiment of the present invention and shows the hardware configuration.

[0030] In FIG. 2, reference numeral 14 denotes an input device such as a keyboard or a pointing device. Reference numeral 15 denotes a display device such as a cathode-ray tube (CRT) display. Reference numeral 16 denotes a central processing unit (CPU) that controls the device by executing various programs. Reference numeral 17 denotes a random access memory (RAM), which is a writable memory. Reference numeral 18 denotes a read-only memory (ROM), from which data can only be read.

[0031] Reference numeral 19 denotes a recording medium on which data is recorded, such as a CD-ROM; reference numeral 20 denotes a reading device that reads data from the recording medium 19, such as a CD-ROM drive; and reference numeral 21 denotes a data bus.

[0032] The correspondence between the functional means in FIG. 1 and the hardware in FIG. 2 is as follows.

[0033] In FIGS. 1 and 2, the input unit 1 is realized by the input device 14, the display unit 12 by the display device 15, and the analysis information storage unit 11 by the RAM 18.

[0034] The kana-kanji conversion dictionary unit 2, the kana-kanji conversion dictionary index storage unit 3, the Japanese analysis dictionary unit 6, and the Japanese analysis dictionary index storage unit 7 are realized by being stored in either the RAM 17 or the ROM 18.

[0035] The kana-kanji information acquisition unit 4, the kana-kanji conversion unit 5, the Japanese analysis information acquisition unit 8, the Japanese analysis unit 9, the index information merging unit 10, and the control unit 13 are realized by the CPU 16 executing various programs stored in the ROM 18 while exchanging data with the RAM 17 and the ROM 18.

[0036] In this embodiment, the CPU 16 executes various programs stored in the ROM 17, but the programs may instead be recorded on the recording medium 19.

[0037] In that case, the CPU 16 loads the program data read from the recording medium 19 via the reading device 20 into the RAM 17 and executes it. This arrangement allows the present invention to be realized easily on a general-purpose computer.

[0038] The operation of the Japanese language analysis device configured as described above is explained below with reference to flowcharts. The operation flowcharts used below show how the CPU 16 executes the various programs stored in the ROM 18 to control the device.

[0039] FIG. 3 is an operation flowchart of the Japanese language analysis device according to the first embodiment of the present invention and shows the operation from kana-kanji conversion of an input character string through Japanese language analysis.

[0040] This embodiment considers the case in which the kana string 「かれはあたまがすばらしい」 (kare wa atama ga subarashii) is converted by kana-kanji conversion and then analyzed.

【００４１】As shown in FIG. 3, when the operator instructs startup of the Japanese language analyzer from the input unit 1, the control unit 13 starts the kana-kanji conversion unit 5 and the Japanese analysis dictionary unit 6 (S1). At this time, the control unit 13 determines whether a kana-kanji conversion device is already running. If it is, the control unit 13 clears from the RAM 17 the index information used for searching the kana-kanji conversion dictionary unit 2 and, at the same time, stores the index information of the kana-kanji conversion dictionary index storage unit 3 and of the Japanese analysis dictionary index storage unit 7 in the analysis information storage unit 11.

【００４２】When the control unit 13 stores the index information of the kana-kanji conversion dictionary index storage unit 3 and of the Japanese analysis dictionary index storage unit 7 in the analysis information storage unit 11, the index information merging unit 10 determines whether the two sets of index information contain matching character strings. Where they match, it merges the indexes, removes the duplicated part, and stores the result in the analysis information storage unit 11 (S2, index information merging step). For example, the kana-kanji conversion dictionary index storage unit 3 stores, for each word written in kana, an index into the kana-kanji conversion dictionary unit 2, which holds the kanji notation of that word and grammatical information such as its part of speech (FIG. 4).

【００４３】The Japanese analysis dictionary index storage unit 7, on the other hand, stores, for each word, an index into the Japanese analysis dictionary unit 6, which holds grammatical information such as the part of speech of the word and the target-language equivalents needed for translation (FIG. 5). The index information merging unit 10 removes the matching parts from these two sets of index information; that is, it shares the Japanese character strings that serve as index keys. In this way it creates an index such as the one shown in FIG. 6 and stores it in the analysis information storage unit 11.
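The merging of the two indexes can be pictured with a small sketch. The Python below is illustrative only: the function name `merge_indexes`, the field names `kk_pos` and `ja_pos`, and the sample keys and positions are assumptions made for this example and are not taken from FIG. 4 to FIG. 6. It assumes that each index maps a Japanese key string to a storage position in its own dictionary, and it shows a merged index that holds each key string only once.

```python
from typing import Dict, NamedTuple, Optional


class MergedEntry(NamedTuple):
    """One row of the merged index: a shared key with two optional positions."""
    kk_pos: Optional[int]  # position in the kana-kanji conversion dictionary (hypothetical field)
    ja_pos: Optional[int]  # position in the Japanese analysis dictionary (hypothetical field)


def merge_indexes(kk_index: Dict[str, int], ja_index: Dict[str, int]) -> Dict[str, MergedEntry]:
    """Merge two key -> position indexes so that each key string is stored once."""
    merged: Dict[str, MergedEntry] = {}
    for key in kk_index.keys() | ja_index.keys():
        # A key missing from one index simply gets None for that dictionary.
        merged[key] = MergedEntry(kk_index.get(key), ja_index.get(key))
    return merged


# Made-up sample data for illustration only.
kk_index = {"かれ": 0, "あたま": 40, "すばらしい": 96}
ja_index = {"かれ": 12, "あたま": 300, "は": 512}

merged = merge_indexes(kk_index, ja_index)
keys_before = len(kk_index) + len(ja_index)  # key strings held while the indexes are separate
keys_after = len(merged)                     # key strings held after merging
```

In this toy data the two separate indexes hold six key strings between them, while the merged index holds four, which is the kind of saving the shared index key is meant to give.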

【００４４】The entries in the kana-kanji conversion dictionary unit 2 are stored as shown in FIG. 7, at positions corresponding to the position information held in the kana-kanji conversion dictionary index storage unit 3. Likewise, the entries in the Japanese analysis dictionary unit 6 are stored as shown in FIG. 8, at positions corresponding to the position information held in the Japanese analysis dictionary index storage unit 7.

【００４５】Next, the user enters a kana character string through the input unit 1 (S3, input step). The input string is converted to kanji by the kana-kanji conversion unit 5, with user intervention when conversion is performed phrase by phrase (S4). During this conversion, the kana-kanji conversion dictionary unit 2 is searched for the input string (S5). Using the index into the kana-kanji conversion dictionary unit 2 that was merged in the index information merging step, the vocabulary information of each word found is retrieved from the kana-kanji conversion dictionary unit 2 as needed (S6, kana-kanji information acquisition step), and the kana-kanji conversion unit 5 converts the input kana string into kanji (S7, kana-kanji conversion step).

【００４６】Next, when the user issues a Japanese analysis instruction through the input unit 1 (S8), the Japanese analysis dictionary unit 6 is searched for the input string (S9). Using the index into the Japanese analysis dictionary that was merged in the index information merging step, the vocabulary information of each word found is retrieved from the Japanese analysis dictionary unit 6 as needed (S10, Japanese analysis information acquisition step), and the Japanese analysis unit 9 performs syntactic and semantic analysis of the mixed kana-kanji string obtained from the kana-kanji conversion unit 5 (S11, Japanese analysis step).
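The two retrieval steps (S5 to S6 for conversion, S9 to S10 for analysis) both go through the same merged index. A minimal sketch of that access pattern follows. Everything named here is hypothetical: `MERGED_INDEX`, `KK_DICT`, `JA_DICT`, the `fetch` function, the record fields, and the assumption that both lookups use the same key strings are chosen for illustration and are not disclosed in the text.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical merged index:
# key -> (position in conversion dictionary, position in analysis dictionary).
MERGED_INDEX: Dict[str, Tuple[Optional[int], Optional[int]]] = {
    "かれ": (0, 0),
    "あたま": (1, 1),
    "すばらしい": (2, None),
}

# Hypothetical dictionary bodies, addressed by position.
KK_DICT: List[dict] = [
    {"surface": "彼", "pos": "noun"},
    {"surface": "頭", "pos": "noun"},
    {"surface": "すばらしい", "pos": "adjective"},
]
JA_DICT: List[dict] = [
    {"pos": "noun", "gloss": "he"},
    {"pos": "noun", "gloss": "head"},
]


def fetch(key: str, which: str) -> Optional[dict]:
    """Look the key up once in the shared index, then read the chosen dictionary.

    `which` is "kk" for the conversion dictionary or "ja" for the analysis dictionary.
    """
    entry = MERGED_INDEX.get(key)
    if entry is None:
        return None  # key is in neither dictionary
    position = entry[0] if which == "kk" else entry[1]
    if position is None:
        return None  # key exists, but not in the requested dictionary
    return KK_DICT[position] if which == "kk" else JA_DICT[position]


conversion_info = fetch("かれ", "kk")  # used in the kana-kanji conversion step
analysis_info = fetch("かれ", "ja")    # used in the Japanese analysis step
```

The point of the sketch is only that one key table serves both dictionaries; how positions are actually encoded is not specified in the text.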

【００４７】Finally, a structure such as that shown in FIG. 7 is displayed on the display unit 12, and the Japanese analysis processing ends.

【００４８】As described above, according to the present embodiment, the Japanese character strings that serve as index keys for the kana-kanji conversion dictionary unit 2 and the Japanese analysis dictionary unit 6 can be shared, so the memory used while the Japanese language analyzer of the present invention is operating can be greatly reduced.

【００４９】(Embodiment 2) In the first embodiment, having the Japanese analysis unit 9 use the vocabulary information acquired from the kana-kanji conversion dictionary unit 2 makes it possible to reduce memory usage during Japanese analysis and to shorten data acquisition time. This operation is described with reference to the flowchart of FIG. 3.

【００５０】In the second embodiment, it is assumed that steps S1 through S7 have been carried out as in the first embodiment, so that processing from input of the kana character string through kana-kanji conversion is complete.

【００５１】Here, the kana-kanji conversion unit 5 passes to the Japanese analysis unit 9 not only the converted mixed kana-kanji string but also the vocabulary information it used during kana-kanji conversion. When searching the Japanese dictionary, the Japanese analysis unit 9 judges, from grammatical information such as part of speech in the vocabulary information it received, whether that information already contains what the analysis needs (S8). It searches the Japanese analysis dictionary only for words whose information is insufficient (S9) and retrieves their vocabulary data from the Japanese analysis dictionary unit 6 (S10). The Japanese analysis unit 9 then uses this vocabulary information to perform syntactic and semantic analysis (S11) and displays the result.
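The selective search in S8 to S10 can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name `collect_vocab`, the `REQUIRED_FIELDS` set, the record layout, and the sample data are all hypothetical. The sketch assumes that the analysis needs at least a part of speech for every word and searches the analysis dictionary only for words where the information passed from kana-kanji conversion lacks it.

```python
from typing import Dict, List, Tuple

# Assumption: the analyzer needs at least a part of speech for every word.
REQUIRED_FIELDS = {"pos"}


def collect_vocab(
    words: List[str],
    passed_info: Dict[str, dict],
    ja_dict: Dict[str, dict],
) -> Tuple[Dict[str, dict], List[str]]:
    """Return vocabulary info per word and the list of words that needed a search."""
    vocab: Dict[str, dict] = {}
    searched: List[str] = []
    for word in words:
        info = passed_info.get(word, {})
        if REQUIRED_FIELDS.issubset(info):
            # Information from kana-kanji conversion is sufficient: no dictionary search.
            vocab[word] = info
        else:
            # Search the analysis dictionary only for this word.
            searched.append(word)
            vocab[word] = {**info, **ja_dict.get(word, {})}
    return vocab, searched


# Made-up sample data for illustration only.
words = ["彼", "は", "頭", "が", "すばらしい"]
passed_info = {
    "彼": {"pos": "noun"},
    "頭": {"pos": "noun"},
    "すばらしい": {"pos": "adjective"},
}
ja_dict = {
    "は": {"pos": "particle"},
    "が": {"pos": "particle"},
    "彼": {"pos": "noun", "gloss": "he"},
}

vocab, searched = collect_vocab(words, passed_info, ja_dict)
```

With this sample input only the two particles trigger a search, so the number of dictionary accesses drops from five to two.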

【００５２】As described above, according to the present embodiment, using the vocabulary information acquired during kana-kanji conversion in the dictionary search for Japanese analysis reduces the number of dictionary searches, which improves the processing efficiency of Japanese analysis, and prevents duplication of the vocabulary information retrieved from the Japanese analysis dictionary unit 6, which reduces memory usage.

【００５３】(Embodiment 3) In the first embodiment, having the Japanese analysis unit 9 use the analysis that the kana-kanji conversion unit 5 performed during kana-kanji conversion makes it possible to improve processing efficiency during Japanese analysis. This operation is described with reference to the flowchart of FIG. 3.

【００５４】In the third embodiment, it is assumed that steps S1 through S7 have been carried out as in the first embodiment, so that processing from input of the kana character string through kana-kanji conversion is complete. Here, the kana-kanji conversion unit 5 passes to the Japanese analysis unit 9 not only the converted mixed kana-kanji string but also the syntactic analysis result it used during kana-kanji conversion. If the user performed kana-kanji conversion phrase by phrase, the result for each phrase is passed to the Japanese analysis unit 9; if the user entered a whole sentence, the result for that sentence is passed.

【００５５】Next, the Japanese analysis unit 9 searches the Japanese analysis dictionary unit 6 for the kana-kanji string passed from the kana-kanji conversion unit 5 and retrieves from it the vocabulary information of each word found. On the basis of the retrieved vocabulary information, it uses the analysis information passed from the kana-kanji conversion unit 5 as an intermediate state of the Japanese analysis. For example, suppose the analysis result is passed from the kana-kanji conversion unit 5 phrase by phrase in the following form.

【００５６】
noun phrase → noun (彼, "he") + particle (は)
noun phrase → noun (頭, "head") + particle (が)
predicate → adjective (すばらしい, "wonderful")
Here, the notation A → B means that the words on the right-hand side combine to form the grammatical unit shown on the left-hand side.

【００５７】The Japanese analysis unit 9 takes the above results as phrases, assembles them into the higher-level sentence, and then performs semantic analysis.

【００５８】Alternatively, suppose the analysis result is passed from the kana-kanji conversion unit 5 as a whole sentence in the following form.

【００５９】
sentence → noun phrase (彼は) + noun phrase (頭が) + predicate (すばらしい)
On the basis of this result, the Japanese analysis unit 9 performs semantic analysis on the predicate.
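The reuse of phrase-level results can be illustrated with a short sketch. It is an assumption-laden example rather than the disclosed method: the `Phrase` type, the `assemble_sentence` function, and the tuple layout are hypothetical names chosen here. The sketch takes phrase results in the A → B form given in the text and builds the sentence-level node from them, without re-deriving the phrases from the character string.

```python
from typing import List, NamedTuple, Tuple


class Phrase(NamedTuple):
    """A phrase already analyzed during kana-kanji conversion (hypothetical type)."""
    label: str                           # e.g. "noun phrase" or "predicate"
    parts: Tuple[Tuple[str, str], ...]   # (part of speech, surface form) pairs


def assemble_sentence(phrases: List[Phrase]) -> Tuple[str, List[Tuple[str, str]]]:
    """Build the sentence node from phrases that are already parsed."""
    children = [(p.label, "".join(surface for _, surface in p.parts)) for p in phrases]
    return ("sentence", children)


# Phrase-level results in the form given in the text.
phrases = [
    Phrase("noun phrase", (("noun", "彼"), ("particle", "は"))),
    Phrase("noun phrase", (("noun", "頭"), ("particle", "が"))),
    Phrase("predicate", (("adjective", "すばらしい"),)),
]

sentence = assemble_sentence(phrases)
```

The resulting node has the same shape as the sentence-level form in the text: two noun phrases followed by a predicate, ready for semantic analysis.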

【００６０】As described above, according to the present embodiment, the Japanese analysis unit 9 reuses the syntactic analysis result produced when the kana-kanji conversion unit 5 performed kana-kanji conversion. This allows a processing phase of the Japanese analysis to be skipped, so an improvement in analysis speed can be expected.

【００６１】[0061]

[Effects of the Invention] As described above, the present invention comprises: an input unit for entering kana characters and for instructing the start of kana-kanji conversion and Japanese syntactic analysis; a kana-kanji conversion dictionary unit that stores the kanji corresponding to kana characters together with their part of speech and syntactic information; a kana-kanji conversion dictionary index storage unit that stores the storage positions of the vocabulary information of the words held in the kana-kanji conversion dictionary unit; a kana-kanji information acquisition unit that, via the kana-kanji conversion dictionary index storage unit, retrieves from the kana-kanji conversion dictionary unit the vocabulary information of a requested word; a kana-kanji conversion unit that converts input kana characters into kanji codes on the basis of the word vocabulary information obtained by the kana-kanji information acquisition unit; a Japanese analysis dictionary unit that stores, for each word, its reading and pronunciation, part of speech and grammatical information, and semantic information; a Japanese analysis dictionary index storage unit that stores the storage positions of the vocabulary information of the words held in the Japanese analysis dictionary unit; a Japanese analysis information acquisition unit that, via the Japanese analysis dictionary index storage unit, retrieves from the Japanese analysis dictionary unit the vocabulary information of a requested word; a Japanese analysis unit that performs Japanese morphological analysis and syntactic and semantic analysis of the mixed kana-kanji sentence obtained from the kana-kanji conversion unit, on the basis of the word vocabulary information obtained by the kana-kanji information acquisition unit; an index information integration unit that integrates the index information in the kana-kanji conversion dictionary index storage unit and the Japanese analysis dictionary index storage unit; an analysis information storage unit that stores the index of the kana-kanji conversion dictionary unit, the index of the Japanese analysis dictionary unit, and the analyses made during kana-kanji conversion and Japanese analysis; and a display unit that displays the Japanese analysis result. With this configuration, when a Japanese language analyzer is newly incorporated into information equipment that already contains a kana-kanji conversion device, and the kana information input to the kana-kanji conversion device is used in the word-sense determination processing of Japanese analysis, memory usage can be reduced.

【００６２】Further, when kana characters are input through the input unit and the kana-kanji conversion unit passes the kana-kanji string obtained by converting the kana characters into kanji to the Japanese analysis unit, the vocabulary information that the kana-kanji information acquisition unit acquired from the kana-kanji dictionary unit is passed at the same time. When the Japanese analysis unit searches the Japanese analysis dictionary, it determines whether any information overlaps with the vocabulary information of the words passed from the kana-kanji conversion unit. If there is overlap, the Japanese analysis information acquisition unit removes the overlap from the information to be acquired from the Japanese analysis dictionary and acquires only the difference in vocabulary information. With this configuration, when a Japanese language analyzer is newly incorporated into information equipment that already contains a kana-kanji conversion device, and the kana information input to the kana-kanji conversion device is used in the word-sense determination processing of Japanese analysis, the increase in memory usage when loading dictionary data into memory is suppressed and the dictionary data acquisition time is shortened.

【００６３】Furthermore, when kana characters are input through the input unit and the kana-kanji conversion unit passes the kana-kanji string obtained by converting the kana characters into kanji to the Japanese analysis unit, the result of the analysis that the kana-kanji conversion unit performed when converting the kana string into kanji is passed at the same time, and the Japanese analysis unit uses this analysis information to perform syntactic and semantic analysis of the mixed kana-kanji sentence output by the kana-kanji conversion unit. Because the Japanese language analyzer reuses the syntactic analysis result passed from kana-kanji conversion and does not repeat syntactic and semantic analysis unnecessarily, the processing time in the Japanese language analyzer can be shortened.

[Brief description of the drawings]

FIG. 1 is a functional block diagram of a data registration device and a translation device according to the first embodiment of the present invention.

FIG. 2 is a circuit block diagram of the data registration device and the translation device according to the same embodiment.

FIG. 3 is a flowchart showing the operation of the Japanese language analyzer according to the same embodiment.

FIG. 4 is a diagram showing the state of the kana-kanji conversion dictionary index storage unit in the same embodiment.

FIG. 5 is a diagram showing the state of the Japanese analysis dictionary index storage unit in the same embodiment.

FIG. 6 is a diagram showing the state of the analysis information storage unit in the same embodiment.

FIG. 7 is a diagram showing the state of the kana-kanji conversion dictionary unit in the same embodiment.

FIG. 8 is a diagram showing the state of the Japanese analysis dictionary unit in the same embodiment.

FIG. 9 is a diagram showing the syntactic and semantic structure of a Japanese analysis result.

FIG. 10 is a diagram showing the syntactic and semantic structure of a Japanese analysis result.

[Explanation of symbols]

1 Input unit
2 Kana-kanji conversion dictionary unit
3 Kana-kanji conversion dictionary index storage unit
4 Kana-kanji information acquisition unit
5 Kana-kanji conversion unit
6 Japanese analysis dictionary unit
7 Japanese analysis dictionary index storage unit
8 Japanese analysis information acquisition unit
9 Japanese analysis unit
10 Index information merging unit
11 Analysis information storage unit
12 Display unit
13 Control unit
14 Keyboard
15 CRT
16 CPU
17 RAM
18 ROM

Claims

[Claims]

1. A Japanese language analyzer comprising: an input unit for entering kana characters and for instructing the start of kana-kanji conversion and Japanese syntactic analysis; a kana-kanji conversion dictionary unit that stores the kanji corresponding to the kana characters together with their part of speech and syntactic information; a kana-kanji conversion dictionary index storage unit that stores the storage positions of the vocabulary information of the words held in the kana-kanji conversion dictionary unit; a kana-kanji information acquisition unit that, via the kana-kanji conversion dictionary index storage unit, retrieves from the kana-kanji conversion dictionary unit the vocabulary information of a requested word; a kana-kanji conversion unit that converts input kana characters into kanji codes on the basis of the word vocabulary information obtained by the kana-kanji information acquisition unit; a Japanese analysis dictionary unit that stores, for each word, its reading and pronunciation, part of speech and grammatical information, and semantic information; a Japanese analysis dictionary index storage unit that stores the storage positions of the vocabulary information of the words held in the Japanese analysis dictionary unit; a Japanese analysis information acquisition unit that, via the Japanese analysis dictionary index storage unit, retrieves from the Japanese analysis dictionary unit the vocabulary information of a requested word; a Japanese analysis unit that performs Japanese morphological analysis and syntactic and semantic analysis of the mixed kana-kanji sentence obtained from the kana-kanji conversion unit, on the basis of the word vocabulary information obtained by the kana-kanji information acquisition unit; an index information integration unit that integrates the index information in the kana-kanji conversion dictionary index storage unit and the Japanese analysis dictionary index storage unit; an analysis information storage unit that stores the index of the kana-kanji conversion dictionary unit, the index of the Japanese analysis dictionary unit, and the analyses made during kana-kanji conversion and Japanese analysis; and a display unit that displays the Japanese analysis result.

2. The Japanese language analyzer described above, wherein, when kana characters are input through the input unit and the kana-kanji conversion unit passes the kana-kanji string obtained by converting the kana characters into kanji to the Japanese analysis unit, the vocabulary information that the kana-kanji information acquisition unit acquired from the kana-kanji dictionary unit is passed at the same time; when the Japanese analysis unit searches the Japanese analysis dictionary unit, it determines whether any information overlaps with the vocabulary information of the words passed from the kana-kanji conversion unit; and, if there is overlap, the Japanese analysis information acquisition unit removes the overlap from the information acquired from the Japanese analysis dictionary unit and acquires only the difference in vocabulary information.

3. The Japanese language analyzer according to claim 1, wherein, when kana characters are input through the input unit and the kana-kanji conversion unit passes the kana-kanji string obtained by converting the kana characters into kanji to the Japanese analysis unit, the result of the analysis performed when the kana-kanji conversion unit converted the kana string into kanji is passed at the same time, and the Japanese analysis unit uses this analysis information to perform syntactic and semantic analysis of the mixed kana-kanji sentence output by the kana-kanji conversion unit.

4. A Japanese language analysis method comprising: an input step of entering kana characters and instructing the start of kana-kanji conversion and Japanese syntactic analysis; a kana-kanji information acquisition step of retrieving from a kana-kanji conversion dictionary unit, via a kana-kanji conversion dictionary index storage unit, the vocabulary information of the requested words for the kana characters entered in the input step; a kana-kanji conversion step of converting the kana characters entered in the input step into kanji codes on the basis of the word vocabulary information obtained in the kana-kanji information acquisition step; a Japanese analysis information acquisition step of retrieving from a Japanese analysis dictionary unit, via a Japanese analysis dictionary index storage unit, the vocabulary information of the requested words for the mixed kana-kanji string converted in the kana-kanji conversion step; a Japanese analysis step of syntactically and semantically analyzing the mixed kana-kanji string converted in the kana-kanji conversion step on the basis of the word vocabulary information obtained in the Japanese analysis information acquisition step; and an index information integration step of integrating the index information in the kana-kanji conversion dictionary index storage unit and the Japanese analysis dictionary index storage unit.

5. The Japanese language analysis method according to claim 4, wherein, when kana characters are input in the input step and the kana-kanji string converted in the kana-kanji conversion step is passed to the Japanese analysis step, the vocabulary information acquired from the kana-kanji dictionary unit in the kana-kanji information acquisition step is passed at the same time; when the Japanese analysis dictionary unit is searched in the Japanese analysis step, it is determined whether any information overlaps with the vocabulary information of the words passed from the kana-kanji conversion step; and, if there is overlap, the overlap is removed from the information acquired from the Japanese analysis dictionary unit and only the difference in vocabulary information is acquired.

6. The Japanese language analysis method according to claim 4, wherein, when kana characters are input in the input step and the kana-kanji string converted in the kana-kanji conversion step is passed to the Japanese analysis step, the result of the analysis performed when the kana-kanji conversion step converted the kana string into kanji is passed at the same time, and the Japanese analysis step uses this analysis information to perform syntactic and semantic analysis of the mixed kana-kanji sentence output by the kana-kanji conversion step.

7. A recording medium on which is recorded a Japanese language analysis program comprising: an input step of entering kana characters and instructing the start of kana-kanji conversion and Japanese syntactic analysis; a kana-kanji information acquisition step of retrieving from a kana-kanji conversion dictionary unit, via a kana-kanji conversion dictionary index storage unit, the vocabulary information of the requested words for the kana characters entered in the input step; a kana-kanji conversion step of converting the kana characters entered in the input step into kanji codes on the basis of the word vocabulary information obtained in the kana-kanji information acquisition step; a Japanese analysis information acquisition step of retrieving from a Japanese analysis dictionary unit, via a Japanese analysis dictionary index storage unit, the vocabulary information of the requested words for the mixed kana-kanji string converted in the kana-kanji conversion step; a Japanese analysis step of syntactically and semantically analyzing the mixed kana-kanji string converted in the kana-kanji conversion step on the basis of the word vocabulary information obtained in the Japanese analysis information acquisition step; and an index information integration step of integrating the index information in the kana-kanji conversion dictionary index storage unit and the Japanese analysis dictionary index storage unit.

8. A recording medium on which is recorded the Japanese language analysis program according to claim 4, wherein, when kana characters are input in the input step and the kana-kanji string converted in the kana-kanji conversion step is passed to the Japanese analysis step, the vocabulary information acquired from the kana-kanji dictionary unit in the kana-kanji information acquisition step is passed at the same time; when the Japanese analysis dictionary unit is searched in the Japanese analysis step, it is determined whether any information overlaps with the vocabulary information of the words passed from the kana-kanji conversion step; and the program includes a Japanese analysis information acquisition step of removing the overlap from the information acquired from the Japanese analysis dictionary unit and acquiring only the difference in vocabulary information.

9. The recording medium according to claim 4, wherein, when kana characters are input in the input step and the kana-kanji string converted in the kana-kanji conversion step is passed to the Japanese analysis step, the result of the analysis performed when the kana-kanji conversion step converted the kana string into kanji is passed at the same time, and the program includes a Japanese analysis step of using this analysis information to perform syntactic and semantic analysis of the mixed kana-kanji sentence output by the kana-kanji conversion step.