JP2874378B2 - Sentence analyzer

Info

Publication number: JP2874378B2
Application number: JP3085881A
Authority: JP
Inventors: 洋志山田
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1991-03-27
Filing date: 1991-03-27
Publication date: 1999-03-24
Anticipated expiration: 2014-03-24
Also published as: JPH04299457A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to an apparatus for analyzing text written in a natural language, and more particularly to a sentence analyzer that performs analysis using co-occurrence relations between words.

[0002]

[Prior Art] In recent years, many attempts have been made in natural language processing to improve analysis accuracy by using co-occurrence relations between words (for example, Japanese Patent Application Laid-Open No. 63-311471).

[0003] An example of a conventional sentence analyzer that uses co-occurrence relations between words is described below.

[0004] FIG. 2 is a block diagram showing a configuration example of a sentence analyzer based on the conventional technique. This sentence analyzer comprises: an input unit 1 for inputting the text to be analyzed; an input buffer 2 for temporarily storing the information input from the input unit 1; a word dictionary 23 in which information such as the reading and part of speech of each word is registered; a co-occurrence dictionary 4 in which co-occurrence relations between words are registered; analysis means 5 for analyzing the text stored in the input buffer 2 using the word dictionary 23 and the co-occurrence dictionary 4; a co-occurrence search unit 26 for searching the co-occurrence dictionary 4 for a co-occurrence relation among a plurality of words specified by the analysis means 5; a search buffer 7 for storing the set of words whose co-occurrence relation is to be searched and the search result; an output buffer 8 for temporarily storing the analysis result produced by the analysis means 5; an output unit 9 for outputting the analysis result held in the output buffer 8; and a control unit 210 for controlling the overall operation.

[0005] The input unit 1 is connected to a keyboard, an OCR device, a voice input device, a file device, or the like; it inputs the text to be analyzed and stores it in the input buffer 2.

[0006] The analysis means 5 consists of a CPU, a memory, and the like, and performs processing such as kana-kanji conversion, morphological analysis, machine translation, and text-to-speech conversion on the input text. When co-occurrence relations between words are used in the course of analysis, the analysis means 5 writes the plurality of words whose co-occurrence relation is to be checked into the search buffer 7 and calls the co-occurrence search unit 26.

[0007] The output unit 9 outputs the analysis result stored in the output buffer 8 to a CRT, a printer, a file device, a speech synthesizer, or the like.

[0008] The word dictionary 23 can be implemented in a memory, a file device, or the like, and registers information such as the reading, notation, and part of speech of each word. It may also register further information required by the analysis means. FIG. 3 is a conceptual diagram showing an example of the word dictionary 23. In FIG. 3, A is the reading of a word, B is its notation, and C is its part of speech.

[0009] The co-occurrence dictionary 4 can be implemented in a memory, a file device, or the like, and registers combinations of two or more words that have a co-occurrence relation. FIG. 4 is a conceptual diagram showing a configuration example of the co-occurrence dictionary 4 in this example. In the figure, each group of words delimited by ruled lines represents one combination of words having a co-occurrence relation. A, B, and C indicate the notation, reading, and part of speech of each word, respectively.

[0010] The co-occurrence search unit 26 searches the co-occurrence dictionary 4 for a co-occurrence relation consisting of the plurality of words stored in the search buffer 7, and stores the search result in the search buffer 7. Possible search methods include binary search and the use of a prepared index. Which method the co-occurrence search unit 26 actually uses does not affect the content of the present invention, so its description is omitted.

[0011] The operation of the conventional sentence analyzer is described below.
(a) The control unit 210 instructs the input unit 1 to input text.
(b) The input unit 1 inputs the text and stores it in the input buffer 2.
(c) The control unit 210 instructs the analysis means 5 to execute text analysis.
(d) The analysis means 5 analyzes the text in the input buffer 2.
(e) The analysis means 5 outputs the result to the output buffer 8.
(f) The control unit 210 instructs the output unit 9 to output the analysis result.
(g) The output unit 9 outputs the analysis result stored in the output buffer 8.

[0012] When it becomes necessary, during processing by the analysis means 5, to check whether a co-occurrence relation exists between words, the co-occurrence dictionary 4 is searched using the co-occurrence search unit 26. FIG. 5 is a flowchart showing the procedure for searching the co-occurrence dictionary 4 in the conventional sentence analyzer. The search procedure is described below with reference to FIG. 5.
(a) The analysis means 5 writes the plurality of words whose co-occurrence relation is to be searched into the search buffer 7 and instructs the co-occurrence search unit 26 to search the co-occurrence dictionary 4 (step 51).
(b) The co-occurrence search unit 26 searches the co-occurrence dictionary 4 and stores the result in the search buffer 7 (step 52). The result is either a value indicating that the co-occurrence relation is not registered in the co-occurrence dictionary (0 in this example) or a value indicating that it is registered (1 in this example).
(c) The analysis means 5 continues processing using the search result written in the search buffer 7 (step 53).
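The lookup of steps 51 to 53 can be sketched roughly as follows. This is only an illustration, not the implementation described in the patent: the set-based `COOCCURRENCE_DICT`, the function name `search_cooccurrence`, and the use of a plain list as the search buffer are assumptions. Only the 0/1 result convention and the 車/運ぶ entry are taken from the description.

```python
# Hypothetical in-memory stand-in for co-occurrence dictionary 4.
# Each entry is an unordered combination of word notations.
COOCCURRENCE_DICT = {frozenset({"車", "運ぶ"})}


def search_cooccurrence(words):
    """Stand-in for co-occurrence search unit 26 (step 52).

    Returns 1 if the combination is registered, 0 otherwise.
    """
    return 1 if frozenset(words) in COOCCURRENCE_DICT else 0


# Step 51: the analysis means writes the words into the search buffer.
search_buffer = ["来る", "運ぶ"]
# Step 52: the dictionary is searched; the analysis means then reads
# the 0/1 result and continues (step 53).
result = search_cooccurrence(search_buffer)
print(result)
```

In this conventional flow the dictionary is consulted for every query, whether or not any of the words has a registered relation.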

[0013] The operation of the conventional sentence analyzer is described below, taking kana-kanji conversion as an example. Consider the case where the sentence 「くるまではこぶ。」 (kuruma de hakobu) is input from the input unit 1 and the analysis means 5 performs kana-kanji conversion. Where needed for the explanation, the example of FIG. 3 is used for the word dictionary 23 and the example of FIG. 4 for the co-occurrence dictionary 4. Four conversion results are possible for this sentence:
・[Candidate 1] 「来るまで運ぶ。」 (carry until it comes)
・[Candidate 2] 「車で運ぶ。」 (carry by car)
・[Candidate 3] 「来るまでは鼓舞。」 (inspire until it comes)
・[Candidate 4] 「車では鼓舞。」 (inspire by car)
To settle on one of them, the co-occurrence dictionary is searched in the following order.
(1) The co-occurrence relation between the words 「来る」 (come) and 「運ぶ」 (carry) in candidate 1 is to be checked.
(2) 「来る」 and 「運ぶ」 are written into the search buffer 7, and the co-occurrence search unit 26 is called (step 51).
(3) The co-occurrence search unit 26 searches the co-occurrence dictionary 4. Since no co-occurrence relation consisting of 「来る」 and 「運ぶ」 is registered in the co-occurrence dictionary 4, 0 is output to the search buffer 7 (step 52).
(4) The analysis means 5 records the search result and moves on to candidate 2 (step 53).

[0014] (5) The co-occurrence relation between the words 「車」 (car) and 「運ぶ」 (carry) in candidate 2 is to be checked.

[0015] (6) 「車」 and 「運ぶ」 are written into the search buffer 7, and the co-occurrence search unit 26 is called (step 51).

[0016] (7) The co-occurrence search unit 26 searches the co-occurrence dictionary 4. Since a co-occurrence relation consisting of 「車」 and 「運ぶ」 is registered in the co-occurrence dictionary 4, 1 is output to the search buffer 7 (step 52).

[0017] (8) Similarly, the co-occurrence dictionary 4 is searched for 「来る」 (come) and 「鼓舞」 (inspire), and for 「車」 and 「鼓舞」; the search result is 0 in both cases. After the co-occurrence dictionary 4 has been searched for candidates 1 to 4, the analysis means 5 outputs candidate 2, 「車で運ぶ。」, the only candidate for which a co-occurrence relation was registered, from the output unit 9 as the conversion result.

[0018] As described above, the conventional sentence analyzer searches the co-occurrence dictionary 4 four times in the course of kana-kanji conversion of the example sentence given here.
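The count of four searches can be reproduced with a small sketch. The class `CountingCooccurrenceDict`, the function `convert_conventionally`, and the pairing of each candidate with a single word pair are illustrative assumptions; the candidates and the registered 車/運ぶ relation come from the example above.

```python
class CountingCooccurrenceDict:
    """Hypothetical co-occurrence dictionary that counts how often it is searched."""

    def __init__(self, entries):
        self._entries = {frozenset(entry) for entry in entries}
        self.lookups = 0

    def search(self, words):
        """Return 1 if the combination is registered, 0 otherwise."""
        self.lookups += 1
        return 1 if frozenset(words) in self._entries else 0


# Candidate conversions of 「くるまではこぶ。」 and the word pair checked for each.
CANDIDATES = [
    ("来るまで運ぶ。", ("来る", "運ぶ")),
    ("車で運ぶ。", ("車", "運ぶ")),
    ("来るまでは鼓舞。", ("来る", "鼓舞")),
    ("車では鼓舞。", ("車", "鼓舞")),
]


def convert_conventionally(candidates, dictionary):
    """Search the dictionary for every candidate and keep those with a hit."""
    return [text for text, pair in candidates if dictionary.search(pair) == 1]


dictionary = CountingCooccurrenceDict([("車", "運ぶ")])
selected = convert_conventionally(CANDIDATES, dictionary)
print(selected, dictionary.lookups)
```

Every candidate triggers one dictionary search here, including the two candidates containing 「鼓舞」, which can never match.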

[0019]

[Problems to Be Solved by the Invention] In natural language processing that uses co-occurrence relations, the co-occurrence dictionary is searched frequently. On the other hand, the number of word combinations for which a co-occurrence relation is registered is usually very small compared with the total number of simple combinations of words. As a result, a search of the co-occurrence dictionary often finds that no co-occurrence relation is registered.

[0020] The conventional sentence analyzer described above cannot tell that a co-occurrence relation is not registered in the co-occurrence dictionary without actually searching the dictionary. It therefore always searches the co-occurrence dictionary, whatever words are being checked, which increases the search time and, in turn, the analysis time.

[0021] An object of the present invention is to provide a sentence analyzer that eliminates this drawback.

[0022]

[Means for Solving the Problems] The first invention is a sentence analyzer comprising an input unit for inputting text to be analyzed, a word dictionary in which the reading, notation, and part of speech of words are registered in a file device or a memory, a co-occurrence dictionary in which combinations of a plurality of words having a co-occurrence relation are registered in a file device or a memory, a co-occurrence search unit for searching the co-occurrence dictionary, analysis means for analyzing the text input from the input unit using the word dictionary and the co-occurrence dictionary, and an output unit for outputting the analysis result, characterized in that: the word dictionary holds, for each registered word, discrimination information indicating whether a co-occurrence relation between that word and another word is registered in the co-occurrence dictionary; the analyzer has a discrimination unit that refers to the discrimination information to identify combinations of words whose co-occurrence relation is not registered in the co-occurrence dictionary; and the co-occurrence search unit omits the search of the co-occurrence dictionary according to the judgment of the discrimination unit.

[0023] The second invention is a sentence analyzer comprising an input unit for inputting text to be analyzed, a word dictionary in which the reading, notation, and part of speech of words are registered in a file device or a memory, a co-occurrence dictionary in which combinations of a plurality of words having a co-occurrence relation are registered in a file device or a memory, a co-occurrence search unit for searching the co-occurrence dictionary, analysis means for analyzing the text input from the input unit using the word dictionary and the co-occurrence dictionary, and an output unit for outputting the analysis result, characterized in that: the co-occurrence dictionary registers each combination of a plurality of words having a co-occurrence relation with the words placed in order; the word dictionary holds, for each registered word, discrimination information indicating whether a co-occurrence relation between that word and another word is registered in the co-occurrence dictionary and, if so, at which position in the combination the word is registered; the analyzer has a discrimination unit that refers to the discrimination information to identify combinations of words whose co-occurrence relation is not registered in the co-occurrence dictionary; and the co-occurrence search unit omits the search of the co-occurrence dictionary according to the judgment of the discrimination unit.

[0024] The third invention is a sentence analyzer comprising an input unit for inputting text to be analyzed, a word dictionary in which the reading, notation, and part of speech of words are registered in a file device or a memory, a co-occurrence dictionary in which combinations of a plurality of words having a co-occurrence relation are registered in a file device or a memory, a co-occurrence search unit for searching the co-occurrence dictionary, analysis means for analyzing the text input from the input unit using the word dictionary and the co-occurrence dictionary, and an output unit for outputting the analysis result, characterized in that: the word dictionary holds, for each word, discrimination information set so that words whose co-occurrence relation is registered in the co-occurrence dictionary have the same value; the analyzer has a discrimination unit that refers to the discrimination information to identify combinations of words whose co-occurrence relation is not registered in the co-occurrence dictionary; and the co-occurrence search unit omits the search of the co-occurrence dictionary according to the judgment of the discrimination unit.

[0025]

[Operation] According to the present invention, each word in the word dictionary is given discrimination information for telling whether a co-occurrence relation between words is registered in the co-occurrence dictionary. This discrimination information is used to identify combinations of words for which no co-occurrence relation is registered, and the search of the co-occurrence dictionary is omitted in those cases, which shortens the analysis time.

[0026] In the first invention, information indicating whether a co-occurrence relation is registered in the co-occurrence dictionary is registered in the word dictionary as discrimination information. If the group of words whose co-occurrence relation is to be searched contains a word for which no co-occurrence relation is registered, the search of the co-occurrence dictionary is omitted, which shortens the search time.
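A minimal sketch of this pre-check, assuming the discrimination information is a 0/1 flag per word as in the embodiment. The dictionary `HAS_COOCCURRENCE`, the function name `can_skip_by_flag`, and the treatment of unknown words as 0 are assumptions made for illustration; the flag values follow the worked example given for the embodiment.

```python
# Hypothetical word-dictionary column: notation -> discrimination information
# (1 = some co-occurrence relation is registered for the word, 0 = none).
HAS_COOCCURRENCE = {"来る": 1, "運ぶ": 1, "車": 1, "鼓舞": 0}


def can_skip_by_flag(words, flags=HAS_COOCCURRENCE):
    """Return True if any word has no registered co-occurrence relation.

    Words missing from the table are treated as 0 (an assumption).
    """
    return any(flags.get(word, 0) == 0 for word in words)


print(can_skip_by_flag(["来る", "鼓舞"]))
print(can_skip_by_flag(["来る", "運ぶ"]))
```

The check is only a filter: a pair such as 来る/運ぶ passes it because both words have some registered relation, so the dictionary still has to be searched to learn that this particular pair is not registered.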

[0027] In the second invention, the position at which each word appears within the co-occurrence relations registered in the co-occurrence dictionary is registered in the word dictionary as discrimination information. If the group of words whose co-occurrence relation is to be searched contains a word for which no co-occurrence relation is registered at the position it occupies in the group, the search of the co-occurrence dictionary is omitted, which shortens the search time.
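One way to picture the position-based check is sketched below. The data layout is an assumption: the passage does not say how positions are encoded, so this sketch stores, for each word, the set of 1-based positions at which it appears in any registered combination. `ORDERED_COOCCURRENCE`, `build_positions`, and `can_skip_by_position` are hypothetical names.

```python
# Hypothetical ordered co-occurrence dictionary: each entry is a tuple of words.
ORDERED_COOCCURRENCE = {("車", "運ぶ")}


def build_positions(entries):
    """Derive, for each word, the 1-based positions it occupies in any entry."""
    positions = {}
    for entry in entries:
        for index, word in enumerate(entry, start=1):
            positions.setdefault(word, set()).add(index)
    return positions


POSITIONS = build_positions(ORDERED_COOCCURRENCE)


def can_skip_by_position(words, positions=POSITIONS):
    """Return True if some word is never registered at the position it has in `words`."""
    return any(
        index not in positions.get(word, set())
        for index, word in enumerate(words, start=1)
    )


print(can_skip_by_position(("車", "運ぶ")))
print(can_skip_by_position(("運ぶ", "車")))
```

Compared with a plain 0/1 flag, this filter can also reject a query whose words all have registered relations but appear in positions where they are never registered.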

[0028] In the third invention, a numerical value assigned so that words whose co-occurrence relation is registered in the co-occurrence dictionary have equal values is registered in the word dictionary as discrimination information. The discrimination information of the words in the group to be searched is compared, and the search of the co-occurrence dictionary is omitted unless all the values are equal, which shortens the search time.
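The equal-value check can be sketched as follows. The values in `GROUP_VALUE` (including the words 雨 and 降る) are invented for illustration, and the sketch assumes each word belongs to a single group; how values are assigned when a word takes part in several relations is not covered by this passage. `can_skip_by_group` is a hypothetical name.

```python
# Hypothetical discrimination values: words that co-occur share a value.
GROUP_VALUE = {"車": 1, "運ぶ": 1, "雨": 2, "降る": 2, "鼓舞": 0}


def can_skip_by_group(words, values=GROUP_VALUE):
    """Return True unless every word in `words` carries the same value.

    Words missing from the table map to None (an assumption of this sketch).
    """
    distinct = {values.get(word) for word in words}
    return len(distinct) != 1


print(can_skip_by_group(["車", "運ぶ"]))
print(can_skip_by_group(["車", "降る"]))
```

Equal values do not prove that a combination is registered; they only mean the dictionary search cannot be ruled out in advance.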

[0029]

[Embodiments] An embodiment of the first invention is described below with reference to the drawings.

[0030] FIG. 1 is a block diagram showing an embodiment of the sentence analyzer of the present invention. This sentence analyzer comprises: an input unit 1 for inputting the text to be analyzed; an input buffer 2 for temporarily storing the information input from the input unit 1; a word dictionary 3 in which information such as the reading and part of speech of each word is registered; a co-occurrence dictionary 4 in which co-occurrence relations between words are registered; analysis means 5 for analyzing the text stored in the input buffer 2 using the word dictionary 3 and the co-occurrence dictionary 4; a co-occurrence search unit 6 for searching the co-occurrence dictionary 4 for a co-occurrence relation among a plurality of words specified by the analysis means 5; a search buffer 7 for storing information on the words to be searched and the search result; an output buffer 8 for temporarily storing the analysis result produced by the analysis means 5; an output unit 9 for outputting the analysis result held in the output buffer 8; a constant storage unit 10 for storing a value indicating that co-occurrence information for a word is not registered in the co-occurrence dictionary 4; a comparison unit 11 for comparing the discrimination information of the words stored in the search buffer 7 with the value stored in the constant storage unit 10; a discrimination unit 12 for deciding, using the constant storage unit 10 and the comparison unit 11, whether the co-occurrence dictionary 4 is to be searched; and a control unit 14 for controlling the overall operation. The constant storage unit 10, the comparison unit 11, and the discrimination unit 12 together constitute discrimination means 13. Components in FIG. 1 that bear the same numbers as in FIG. 2 are the same components, and their description is omitted.

[0031] The operation of this embodiment is described below.
(a) The control unit 14 instructs the input unit 1 to input text.
(b) The input unit 1 inputs the text and stores it in the input buffer 2.
(c) The control unit 14 instructs the analysis means 5 to execute text analysis.
(d) The analysis means 5 analyzes the text in the input buffer 2.
(e) When the analysis is finished, the analysis means 5 outputs the result to the output buffer 8.
(f) The control unit 14 instructs the output unit 9 to output the analysis result.
(g) The output unit 9 outputs the analysis result stored in the output buffer 8.

[0032] In addition to the information used in the conventional technique, the word dictionary 3 holds, for each word, discrimination information indicating whether a co-occurrence relation for that word is registered in the co-occurrence dictionary 4. The discrimination information is either a value indicating that a co-occurrence relation is registered (1 is used as an example in the following description) or a value indicating that no co-occurrence relation is registered (0 is used as an example in the following description). FIG. 6 is a conceptual diagram showing a configuration example of the word dictionary 3 of this embodiment. In FIG. 6, A is the notation of a word, B is its reading, C is its part of speech, and D is the discrimination information.

[0033] The discrimination means 13 reads the discrimination information of the words from the search buffer 7, examines it with the comparison unit 11, and judges whether the given words include a word for which no co-occurrence relation at all is registered. The discrimination means 13 then returns to the co-occurrence search unit 6 either a value indicating that there is a word for which no co-occurrence relation at all is registered (0 is used as an example in the following description) or a value indicating that there is no such word (1 is used as an example in the following description).

[0034] The constant storage unit 10 stores the value of the discrimination information, as used in the word dictionary 3, that indicates that a co-occurrence relation is registered.

[0035] The comparison unit 11 compares the discrimination information of a word with the value stored in the constant storage unit 10.

[0036] Before searching the co-occurrence dictionary 4, the co-occurrence search unit 6 calls the discrimination means 13; if the discrimination means 13 shows that the co-occurrence relation is not registered in the co-occurrence dictionary 4, the search of the co-occurrence dictionary 4 is omitted. FIG. 7 is a flowchart showing the procedure for searching the co-occurrence dictionary 4 in this embodiment. The procedure is described below with reference to FIG. 7.
(a) The analysis means 5 writes the plurality of words whose co-occurrence relation is to be searched into the search buffer 7 and instructs the co-occurrence search unit 6 to search the co-occurrence dictionary 4 (step 71).
(b) The co-occurrence search unit 6 uses the discrimination means 13 to decide whether a search of the co-occurrence dictionary 4 is necessary (step 72).
(c) If the discrimination means 13 returns 0 (the value indicating that the search buffer 7 contains a word for which no co-occurrence relation at all is registered), the co-occurrence search unit 6 writes 0 (the value indicating that the co-occurrence relation is not registered in the co-occurrence dictionary 4) into the search buffer 7 and ends the search (step 73).
(d) If the discrimination means 13 returns 1 (the value indicating that the search buffer 7 contains no word for which no co-occurrence relation at all is registered), the co-occurrence search unit 6 searches the co-occurrence dictionary 4, writes the result into the search buffer 7, and ends the search (step 74).
(e) The analysis means 5 continues processing using the search result written in the search buffer 7 (step 75).
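The flow of FIG. 7 can be sketched as follows. The table `WORD_FLAGS`, the set `COOCCURRENCE_DICT`, the counter `dictionary_searches`, and the function names are illustrative assumptions; the flag values and the registered 車/運ប relation are replaced here by the values used in the worked example of this description (来る, 運ぶ, and 車 have flag 1, 鼓舞 has flag 0, and only 車/運ぶ is registered).

```python
WORD_FLAGS = {"来る": 1, "運ぶ": 1, "車": 1, "鼓舞": 0}  # discrimination information D
COOCCURRENCE_DICT = {frozenset({"車", "運ぶ"})}           # stand-in for dictionary 4
dictionary_searches = 0  # counts actual searches of the co-occurrence dictionary


def discriminate(words):
    """Stand-in for discrimination means 13.

    Returns 0 if some word has no registered relation at all, else 1.
    """
    return 0 if any(WORD_FLAGS.get(word, 0) == 0 for word in words) else 1


def cooccurrence_search(search_buffer):
    """Stand-in for co-occurrence search unit 6 (steps 72 to 74)."""
    global dictionary_searches
    if discriminate(search_buffer) == 0:   # step 72
        return 0                           # step 73: dictionary search omitted
    dictionary_searches += 1               # step 74: dictionary search performed
    return 1 if frozenset(search_buffer) in COOCCURRENCE_DICT else 0


pairs = [("来る", "運ぶ"), ("車", "運ぶ"), ("来る", "鼓舞"), ("車", "鼓舞")]
results = [cooccurrence_search(list(pair)) for pair in pairs]  # step 71 per pair
print(results, dictionary_searches)
```

With these example values the two pairs containing 「鼓舞」 are answered from the flags alone, so the dictionary itself is searched only for the first two pairs.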

[0037] FIG. 8 is a flowchart showing an example of the operation of the discrimination means 13 in this embodiment. The operation of the discrimination means 13 is described below with reference to FIG. 8. In the description, n denotes the number of words in the search buffer 7, W_i the i-th word, and W_i·f the discrimination information of the word W_i.
(a) i ← 1 (step 81).
(b) The comparison unit 11 compares W_i·f with the value stored in the constant storage unit 10 (step 82).
(c) If W_i·f is not equal to the value stored in the constant storage unit 10, go to (g) (step 83).
(d) i ← i + 1 (step 84).
(e) If i ≤ n, return to (b) (step 85).
(f) Return 1 to the co-occurrence search unit 6 and end (step 86).
(g) Return 0 to the co-occurrence search unit 6 and end (step 87).
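Steps 81 to 87 amount to a short loop, sketched below. The function name `discriminate`, the list-of-flags argument, and the constant name `REGISTERED` are assumptions; the constant is taken to be the "registered" value 1, following paragraph 0034 and the comparison in step (c).

```python
REGISTERED = 1  # value assumed to be held in constant storage unit 10


def discriminate(flags, constant=REGISTERED):
    """Follow steps 81 to 87 for the flags W_1.f ... W_n.f of the buffered words.

    Returns 1 if every flag equals the constant, otherwise 0.
    """
    n = len(flags)
    i = 1                              # (a) step 81
    while i <= n:                      # (e) step 85
        if flags[i - 1] != constant:   # (b), (c) steps 82 and 83
            return 0                   # (g) step 87
        i += 1                         # (d) step 84
    return 1                           # (f) step 86


print(discriminate([1, 1]), discriminate([1, 0]))
```

The flowchart tests i ≤ n after the increment, whereas the `while` loop tests it before each pass; for a buffer holding at least one word the two orderings give the same result.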

【００３８】以下に、かな漢字変換を例にして本実施例
の動作について説明する。「くるまではこぶ。」という
文を入力部１から入力し、解析手段５でかな漢字変換処
理を行う場合を考える。説明に必要な場合、単語辞書３
は図６、共起辞書４は図４の例を用いる。従来の文章解
析装置と同様に「くるまではこぶ。」という文の変換結
果として、・［候補１］「来るまで運ぶ。」・［候補２］「車で運ぶ。」・［候補３］「来るまでは鼓舞。」・［候補４］「車では鼓舞。」という４通りが考えられる。この中の１つに決定するた
めに、以下の順序で共起辞書の検索を行う。（１）候補１中の単語「来る」と「運ぶ」の共起関係を
調べる。（２）「来る」と「運ぶ」を検索バッファ７に書き込
み、共起検索部６を呼ぶ（ステップ７１）。（３）共起検索部６は、判別手段１３を呼ぶ（ステップ
７２）。（４）判別手段１３は、「来る」の判別情報を調べる
（ステップ８２）。（５）「来る」の判別情報は１で定数記憶部１０に記憶
されている値と等しいので、判別手段１３は次の単語の
判別情報を調べる（ステップ８５）。（６）判別手段１３は、「運ぶ」の判別情報を調べる
（ステップ８２）。（７）「運ぶ」の判別情報は１で、かつ「運ぶ」が最後
の単語なので、判別手段１３は共起検索部６に１を返す
（ステップ８５，８６）。（８）共起検索部６は、共起辞書４を検索する。「車」
と「運ぶ」とからなる共起関係が共起辞書４に登録され
ていないので検索バッファ７に０を出力する（ステップ
７４）。（９）解析手段５は、検索結果を記録し、次に候補２中
の単語「車」と「運ぶ」の共起関係を調べる。（１０）「車」と「運ぶ」を検索バッファ７に書き込
み、共起検索部６を呼ぶ（ステップ７１）。（１１）共起検索部６は、判別手段１３を呼ぶ（ステッ
プ７２）。（１２）判別手段１３は、「車」の判別情報を調べる
（ステップ８２）。（１３）「車」の判別情報は１なので、判別手段１３は
次の単語の判別情報を調べる（ステップ８５）。（１４）判別手段１３は、「運ぶ」の判別情報を調べる
（ステップ８２）。（１５）「運ぶ」の判別情報は１で、かつ「運ぶ」が最
後の単語なので、判別手段１３は共起検索部６に１を返
す（ステップ８５，８６）。（１６）共起検索部６は、共起辞書４を検索する。
「車」と「運ぶ」とからなる共起関係が共起辞書４に登
録されているので検索バッファ７に１を出力する（ステ
ップ７４）。（１７）解析手段５は、検索結果を記録し、次に候補３
中の単語「来る」と「鼓舞」の共起関係を調べる。（１８）「来る」と「鼓舞」を検索バッファ７に書き込
み、共起検索部６を呼ぶ（ステップ７１）。（１９）共起検索部６は、判別手段１３を呼ぶ（ステッ
プ７２）。（２０）判別手段１３は、「来る」の判別情報を調べる
（ステップ８２）。（２１）「来る」の判別情報は１なので、判別手段１３
は次の単語の判別情報を調べる（ステップ８５）。（２２）判別手段１３は、「鼓舞」の判別情報を調べる
（ステップ８２）。（２３）「鼓舞」の判別情報は０なので、判別手段１３
は共起検索部６に０を返す（ステップ８７）。（２４）共起検索部６は、検索バッファ７に０を書き込
み検索を終了する（ステップ７３）。（２５）同様に候補４中の単語「車」と「鼓舞」につい
て判別情報を調べ、判別手段１３から０を得て、検索結
果は０となる。候補１−４について共起検索部６による
処理が終わったあと、解析手段５は、共起関係の登録さ
れていた候補２「車で運ぶ。」を変換結果として出力部
から出力する。The operation of this embodiment is described below, taking kana-kanji conversion as an example. Consider the case where the sentence "くるまではこぶ。" is input from the input unit 1 and the analysis means 5 performs kana-kanji conversion. Where needed for the explanation, the word dictionary 3 is that of FIG. 6 and the co-occurrence dictionary 4 is that of FIG. 4. As with the conventional sentence analyzer, the sentence "くるまではこぶ。" has four possible conversion results:
[Candidate 1] "来るまで運ぶ。" (carry until coming)
[Candidate 2] "車で運ぶ。" (carry by car)
[Candidate 3] "来るまでは鼓舞。" (inspire until coming)
[Candidate 4] "車では鼓舞。" (inspire by car)
To select one of them, the co-occurrence dictionary is searched in the following order.
(1) Examine the co-occurrence relation between the words "come" (来る) and "carry" (運ぶ) in candidate 1.
(2) Write "come" and "carry" to the search buffer 7 and call the co-occurrence search unit 6 (step 71).
(3) The co-occurrence search unit 6 calls the determination means 13 (step 72).
(4) The determination means 13 checks the discrimination information of "come" (step 82).
(5) The discrimination information of "come" is 1, which equals the value stored in the constant storage unit 10, so the determination means 13 goes on to the discrimination information of the next word (step 85).
(6) The determination means 13 checks the discrimination information of "carry" (step 82).
(7) The discrimination information of "carry" is 1 and "carry" is the last word, so the determination means 13 returns 1 to the co-occurrence search unit 6 (steps 85 and 86).
(8) The co-occurrence search unit 6 searches the co-occurrence dictionary 4. No co-occurrence relation consisting of "come" and "carry" is registered in the co-occurrence dictionary 4, so 0 is written to the search buffer 7 (step 74).
(9) The analysis means 5 records the search result and next examines the co-occurrence relation between the words "car" (車) and "carry" in candidate 2.
(10) Write "car" and "carry" to the search buffer 7 and call the co-occurrence search unit 6 (step 71).
(11) The co-occurrence search unit 6 calls the determination means 13 (step 72).
(12) The determination means 13 checks the discrimination information of "car" (step 82).
(13) The discrimination information of "car" is 1, so the determination means 13 goes on to the discrimination information of the next word (step 85).
(14) The determination means 13 checks the discrimination information of "carry" (step 82).
(15) The discrimination information of "carry" is 1 and "carry" is the last word, so the determination means 13 returns 1 to the co-occurrence search unit 6 (steps 85 and 86).
(16) The co-occurrence search unit 6 searches the co-occurrence dictionary 4. The co-occurrence relation consisting of "car" and "carry" is registered in the co-occurrence dictionary 4, so 1 is written to the search buffer 7 (step 74).
(17) The analysis means 5 records the search result and next examines the co-occurrence relation between the words "come" and "inspire" (鼓舞) in candidate 3.
(18) Write "come" and "inspire" to the search buffer 7 and call the co-occurrence search unit 6 (step 71).
(19) The co-occurrence search unit 6 calls the determination means 13 (step 72).
(20) The determination means 13 checks the discrimination information of "come" (step 82).
(21) The discrimination information of "come" is 1, so the determination means 13 goes on to the discrimination information of the next word (step 85).
(22) The determination means 13 checks the discrimination information of "inspire" (step 82).
(23) The discrimination information of "inspire" is 0, so the determination means 13 returns 0 to the co-occurrence search unit 6 (step 87).
(24) The co-occurrence search unit 6 writes 0 to the search buffer 7 and ends the search (step 73).
(25) Likewise, the discrimination information is checked for the words "car" and "inspire" in candidate 4; the determination means 13 returns 0, so the search result is 0.
After the co-occurrence search unit 6 has processed candidates 1 to 4, the analysis means 5 outputs candidate 2, "車で運ぶ。" (carry by car), whose co-occurrence relation is registered, from the output unit as the conversion result.

【００３９】以上のように、本発明による文章解析装置
では、ここで挙げた例文をかな漢字変換する過程で共起
辞書４の検索が２回しか行われない。As described above, in the sentence analysis apparatus according to the present invention, the co-occurrence dictionary 4 is searched only twice in the process of converting the example sentence described above into kana-kanji characters.
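
The walk-through above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the apparatus itself: the names `FLAGS`, `COOCCURRENCE`, `REGISTERED`, `may_cooccur` and `search` are hypothetical, and the data is limited to the four words and the single registered relation that the example mentions.

```python
# Discrimination information per word (1 = some co-occurrence relation is
# registered for the word, 0 = none), as stated in the walk-through.
FLAGS = {"come": 1, "car": 1, "carry": 1, "inspire": 0}

# Co-occurrence dictionary, reduced to the one relation the example relies on.
COOCCURRENCE = {("car", "carry")}

REGISTERED = 1  # value held by the constant storage unit (assumed to be 1)

lookups = 0  # how many times the co-occurrence dictionary is actually searched


def may_cooccur(words):
    """Return 1 if every word carries the REGISTERED flag, else 0."""
    for word in words:
        if FLAGS[word] != REGISTERED:
            return 0
    return 1


def search(words):
    """Search the dictionary only when the flag check does not rule it out."""
    global lookups
    if may_cooccur(words) == 0:
        return 0
    lookups += 1
    return 1 if tuple(words) in COOCCURRENCE else 0


candidates = [("come", "carry"), ("car", "carry"),
              ("come", "inspire"), ("car", "inspire")]
results = [search(pair) for pair in candidates]
print(results, lookups)
```

Only candidates 1 and 2 reach the dictionary in this sketch; candidates 3 and 4 are rejected by the flag check alone.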

【００４０】以下では、第２の発明の実施例について図
面を参照しながら説明する。Hereinafter, an embodiment of the second invention will be described with reference to the drawings.

【００４１】図９は本発明の文章解析装置の一実施例を
示すブロック図である。この文章解析装置は、解析する
文章を入力する入力部１，入力部１から入力された情報
を一時格納する入力バッファ２，単語の読みや品詞など
の情報を登録している単語辞書９３，単語の共起関係を
登録している共起辞書４，単語辞書９３及び共起辞書４
を用いて入力バッファ２に格納されている文章を解析す
る解析手段５，解析手段５によって指定された複数の単
語間の共起関係を共起辞書４から検索する共起検索部９
６，検索すべき単語の情報及び検索結果を格納する検索
バッファ７，解析手段５による解析結果を一時格納する
出力バッファ８，出力バッファ８の解析結果を出力する
出力部９，単語の共起情報が共起辞書４に登録されてい
ないことを示す値を記憶する定数記憶部１０，検索バッ
ファ７に格納されている単語の判別情報から指定された
桁を抜き出す抽出部９１１，抽出部９１１によって抜き
出された値を定数記憶部１０が記憶する値と比較する比
較部９１２，判別情報及び判定に使う桁の指定を抽出部
９１１に与えて検索を行うかどうかを判別する判別部９
１３，全体の動作の制御を行う制御部９１５とを備えて
いる。定数記憶部１０と抽出部９１１と比較部９１２と
判別部９１３とは、判別手段９１４を構成している。図
９において、図１と同じ番号のものは同じものを示して
いるので説明を省略する。FIG. 9 is a block diagram showing an embodiment of the sentence analyzer of the present invention. The sentence analyzer comprises: an input unit 1 for inputting the sentence to be analyzed; an input buffer 2 that temporarily stores the information input from the input unit 1; a word dictionary 93 in which information such as the reading and part of speech of each word is registered; a co-occurrence dictionary 4 in which co-occurrence relations between words are registered; analysis means 5 that analyzes the sentence stored in the input buffer 2 using the word dictionary 93 and the co-occurrence dictionary 4; a co-occurrence search unit 96 that searches the co-occurrence dictionary 4 for a co-occurrence relation among the words specified by the analysis means 5; a search buffer 7 that stores the information on the words to be searched and the search result; an output buffer 8 that temporarily stores the analysis result of the analysis means 5; an output unit 9 that outputs the analysis result held in the output buffer 8; a constant storage unit 10 that stores a value indicating that co-occurrence information for a word is not registered in the co-occurrence dictionary 4; an extraction unit 911 that extracts a specified digit from the discrimination information of a word stored in the search buffer 7; a comparison unit 912 that compares the value extracted by the extraction unit 911 with the value stored in the constant storage unit 10; a determination unit 913 that passes the discrimination information and the digit to be used to the extraction unit 911 and determines whether to perform the search; and a control unit 915 that controls the overall operation. The constant storage unit 10, the extraction unit 911, the comparison unit 912, and the determination unit 913 together constitute determination means 914. Components in FIG. 9 bearing the same numbers as in FIG. 1 are the same and are not described again.

【００４２】以下では、本実施例の動作を説明する。（ａ）制御部９１５は、入力部１に文章の入力を指示す
る。（ｂ）入力部１は、文章を入力し、入力バッファ２に格
納する。（ｃ）制御部９１５は、解析手段５に文章解析の実行を
指示する。（ｄ）解析手段５によって入力バッファ２中の文章の解
析を行う。（ｅ）解析が終わると解析手段５は、結果を出力バッフ
ァ８に出力する。（ｆ）制御部９１５は、出力部９に解析結果の出力を指
示する。（ｇ）出力部９は、出力バッファ８に格納されている解
析結果を出力する。Hereinafter, the operation of this embodiment will be described. (A) The control unit 915 instructs the input unit 1 to input a sentence. (B) The input unit 1 inputs a sentence and stores it in the input buffer 2. (C) The control unit 915 instructs the analysis unit 5 to execute a sentence analysis. (D) The analysis means 5 analyzes the text in the input buffer 2. (E) When the analysis is completed, the analysis means 5 outputs the result to the output buffer 8. (F) The control unit 915 instructs the output unit 9 to output the analysis result. (G) The output unit 9 outputs the analysis result stored in the output buffer 8.

【００４３】図１０は、本実施例の単語辞書９３の構成
例を示す概念図である。図１０で、Ａが単語の表記、Ｂ
が単語の読み、Ｃが単語の品詞、Ｄが判別情報である。
判別情報はｎ桁の２進数で表される（図１０ではｎ＝３
としている）。ある単語と他の単語との共起関係が共起
辞書４に登録してあり、その共起関係のｉ番目の単語で
あるときｉ桁目が特定の値（本実施例では１とする）に
なり、その単語をｉ番目の単語とする共起関係が登録さ
れていなければ別の特定の値（本実施例では０とする）
になる。共起関係が複数登録されている場合は、それぞ
れの共起関係中での順序に応じて複数の桁が１になる。
共起関係が一つも登録されていない場合は、すべての桁
が０になる。例えば図１０の単語「餌」は図４の
「餌」，「運ぶ」という共起関係の１番目の単語であ
り、「動物」，「餌」，「与える」という共起関係の２
番目の単語であるので、１桁目と２桁目が１になってい
る。FIG. 10 is a conceptual diagram showing an example of the structure of the word dictionary 93 of this embodiment. In FIG. 10, A is the written form of the word, B its reading, C its part of speech, and D its discrimination information. The discrimination information is an n-digit binary number (n = 3 in FIG. 10). If a co-occurrence relation between the word and other words is registered in the co-occurrence dictionary 4 and the word is the i-th word of that relation, the i-th digit takes a specific value (1 in this embodiment); if no co-occurrence relation having the word as its i-th word is registered, the digit takes another specific value (0 in this embodiment). If several co-occurrence relations are registered, several digits are 1, according to the position of the word in each relation. If no co-occurrence relation is registered, all digits are 0. For example, the word "bait" (餌) in FIG. 10 is the first word of the co-occurrence relation "bait" (餌), "carry" (運ぶ) in FIG. 4 and the second word of the co-occurrence relation "animal" (動物), "bait" (餌), "give" (与える), so its first and second digits are 1.
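
The encoding described in this paragraph can be illustrated with a short Python sketch. It is a hedged illustration only: the function name `build_discrimination_info` is hypothetical, the digits are assumed to be stored left to right as a tuple (the text does not say how the n-digit number is laid out), and the only relations used are the two that the text quotes for the word "bait".

```python
def build_discrimination_info(relations, n_digits=3):
    """Map each word to a tuple of n_digits bits.

    Bit i-1 is 1 when the word is the i-th word of at least one registered
    co-occurrence relation. Relations longer than n_digits are not handled.
    """
    info = {}
    for relation in relations:
        for position, word in enumerate(relation):
            bits = info.setdefault(word, [0] * n_digits)
            bits[position] = 1
    return {word: tuple(bits) for word, bits in info.items()}


# The two relations quoted in the text for the word "bait".
relations = [("bait", "carry"), ("animal", "bait", "give")]
info = build_discrimination_info(relations)
print(info["bait"])
```

A word that appears in no relation is simply absent from `info` in this sketch; in the word dictionary it would carry all-zero digits.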

【００４４】定数記憶部１０は、単語辞書９３で用いら
れている、共起関係が登録されていることを示す判別情
報の値を記憶している。The constant storage unit 10 stores the value of discrimination information used in the word dictionary 93 and indicating that a co-occurrence relationship is registered.

【００４５】判別手段９１４は、検索バッファ７から単
語の判別情報を読み込み、ｉ番目の単語の判別情報のｉ
桁目を抽出部９１１によって抜き出す。比較部９１２
は、抜き出された値を定数記憶部１０に記憶されている
値と比較する。そして、判別手段９１４は、検索バッフ
ァ７中の単語からなる共起関係が共起辞書４に登録され
ていないことを示す値（以下の説明では例として０を用
いる）、または、共起辞書４を検索する必要があること
を示す値（以下の説明では例として１を用いる）のいず
れかを共起検索部９６に返す。The determination means 914 reads the discrimination information of the words from the search buffer 7, and the extraction unit 911 extracts the i-th digit of the discrimination information of the i-th word. The comparison unit 912 compares the extracted value with the value stored in the constant storage unit 10. The determination means 914 then returns to the co-occurrence search unit 96 either a value indicating that the co-occurrence relation consisting of the words in the search buffer 7 is not registered in the co-occurrence dictionary 4 (0 is used as an example in the following description) or a value indicating that the co-occurrence dictionary 4 needs to be searched (1 is used as an example in the following description).

【００４６】図１１は本実施例における判別手段９１４
の動作の一例を示す流れ図である。以下、図１１に従っ
て判別手段９１４の動作を説明する。説明にあたり、検
索バッファ７中の単語数をｎ、ｉ番目の単語をＷ_i、単
語Ｗ_iの判別情報をＷ_i・ｆと表す。（ａ）ｉ←１（ステップ１１１）（ｂ）ｉとＷ_i・ｆを抽出部９１１にわたし、Ｗ. ｆの
ｉ桁目を抜き出す（ステップ１１２）。（ｃ）比較部９１２でＷ．ｆのｉ桁目を定数記憶部１０
に記憶されている値と比較する（ステップ１１３）。（ｄ）ｉ桁目が定数記憶部１０に記憶されている値と等
しくないならば（ｈ）へ（ステップ１１４）。（ｅ）ｉ←ｉ＋１（ステップ１１５）（ｆ）ｉ≦ｎなら（ｂ）へ戻る（ステップ１１６）。（ｇ）共起検索部６に１を返し、終了する（ステップ１
１７）。（ｈ）共起検索部６に０を返し、終了する（ステップ１
１８）。FIG. 11 is a flowchart showing an example of the operation of the determination means 914 in this embodiment. The operation of the determination means 914 is described below with reference to FIG. 11. In the description, n is the number of words in the search buffer 7, W_i is the i-th word, and W_i·f is the discrimination information of the word W_i.
(a) i ← 1 (step 111).
(b) Pass i and W_i·f to the extraction unit 911 and extract the i-th digit of W_i·f (step 112).
(c) The comparison unit 912 compares the i-th digit of W_i·f with the value stored in the constant storage unit 10 (step 113).
(d) If the i-th digit is not equal to the value stored in the constant storage unit 10, go to (h) (step 114).
(e) i ← i + 1 (step 115).
(f) If i ≤ n, return to (b) (step 116).
(g) Return 1 to the co-occurrence search unit 6 and end (step 117).
(h) Return 0 to the co-occurrence search unit 6 and end (step 118).
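
Steps (a) to (h) of FIG. 11 can be sketched in Python as below. This is a minimal sketch under assumptions rather than the patented circuit: `needs_search` and `REGISTERED` are hypothetical names, the discrimination information is assumed to be a tuple whose element i-1 is the "i-th digit", and in the sample data only the first digit of "car", the second digit of "carry" and the all-zero value of "inspire" follow the text; the remaining digits are placeholders.

```python
REGISTERED = 1  # value held by the constant storage unit (assumed to be 1)


def needs_search(flags_per_word):
    """Return 1 if the dictionary must be searched, 0 if it can be skipped.

    flags_per_word[i-1] is the discrimination information of the i-th word,
    given as a sequence of bits whose element i-1 is the i-th digit.
    """
    for i, flags in enumerate(flags_per_word, start=1):
        digit = flags[i - 1]     # extraction unit: take the i-th digit
        if digit != REGISTERED:  # comparison unit
            return 0             # step (h)
    return 1                     # step (g)


car = (1, 0, 0)      # first digit is 1
carry = (0, 1, 0)    # second digit is 1
inspire = (0, 0, 0)  # no relation registered

print(needs_search([car, carry]),
      needs_search([car, inspire]),
      needs_search([carry, car]))
```

Because the check is positional, the same two words can pass in one order and fail in the other, which a single per-word flag cannot express.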

【００４７】以下に、かな漢字変換を例にして本実施例
の動作について説明する。「くるまではこぶ。」という
文を入力部１から入力し、解析手段５でかな漢字変換処
理を行う場合を考える。説明に必要な場合、単語辞書９
３は図１０、共起辞書４は図４の例を用いる。「くるま
ではこぶ。」という文の変換結果として、・［候補１］「来るまで運ぶ。」・［候補２］「車で運ぶ。」・［候補３］「来るまでは鼓舞。」・「候補４］「車では鼓舞。」という４通りが考えられる。この中の１つに決定するた
めに、以下の順序で共起辞書の検索を行う。（１）候補１中の単語「来る」と「運ぶ」の共起関係を
調べる。（２）「来る」と「運ぶ」を検索バッファ７に書き込
み、共起検索部６を呼ぶ（ステップ７１）。（３）共起検索部９６は、判別手段９１４を呼ぶ（ステ
ップ７２）。（４）判別手段９１４は、「来る」の判別情報の１桁目
を抜き出し、定数記憶部１０に記憶されている値と比較
する（ステップ１１２，１１３）。（５）「来る」の判別情報の１桁目は０で定数記憶部１
０に記憶されている値と等しくないので、判別手段９１
４は共起検索部９６に０を返す（ステップ１１４，１１
８）。（６）共起検索部９６は、検索バッファ７に０を書き込
み検索を終了する（ステップ７３）。（７）解析手段５は、検索結果を記録し、次に候補２中
の単語「車」と「運ぶ」の共起関係を調べる。（８）「車」と「運ぶ」を検索バッファ７に書き込み、
共起検索部９６を呼ぶ（ステップ７１）。（９）共起検索部９６は、判別手段９１４を呼ぶ（ステ
ップ７２）。（１０）判別手段９１４は、「車」の判別情報の１桁目
を調べる（ステップ１１２，１１３）。（１１）「車」の判別情報の１桁目は１で定数記憶部１
０に記憶されている値と等しいので、判別手段９１４は
次の単語の判別情報を調べる（ステップ１１５，１１
６）。（１２）判別手段９１４は、「運ぶ」の判別情報の２桁
目を調べる（ステップ１１２，１１３）。（１３）「運ぶ」の判別情報の２桁目は１で、かつ「運
ぶ」が最後の単語なので、判別手段９１４は共起検索部
９６に１を返す（ステップ１１５−１１７）。（１４）共起検索部９６は共起辞書４を検索する。
「車」と「運ぶ」とからなる共起関係が共起辞書４に登
録されているので検索バッファ７に１を出力する（ステ
ップ７４）。（１５）同様に候補３中の単語「来る」と「鼓舞」、候
補４中の単語「車」と「鼓舞」について判別情報を調
べ、判別手段９１４から０を得て、共に検索結果は０と
なる。候補１−４について共起検索部６による処理が終
わったあと、解析手段５は、共起関係の登録されていた
候補２「車で運ぶ。」を変換結果として出力部から出力
する。The operation of this embodiment is described below, taking kana-kanji conversion as an example. Consider the case where the sentence "くるまではこぶ。" is input from the input unit 1 and the analysis means 5 performs kana-kanji conversion. Where needed for the explanation, the word dictionary 93 is that of FIG. 10 and the co-occurrence dictionary 4 is that of FIG. 4. The sentence "くるまではこぶ。" has four possible conversion results:
[Candidate 1] "来るまで運ぶ。" (carry until coming)
[Candidate 2] "車で運ぶ。" (carry by car)
[Candidate 3] "来るまでは鼓舞。" (inspire until coming)
[Candidate 4] "車では鼓舞。" (inspire by car)
To select one of them, the co-occurrence dictionary is searched in the following order.
(1) Examine the co-occurrence relation between the words "come" (来る) and "carry" (運ぶ) in candidate 1.
(2) Write "come" and "carry" to the search buffer 7 and call the co-occurrence search unit 6 (step 71).
(3) The co-occurrence search unit 96 calls the determination means 914 (step 72).
(4) The determination means 914 extracts the first digit of the discrimination information of "come" and compares it with the value stored in the constant storage unit 10 (steps 112 and 113).
(5) The first digit of the discrimination information of "come" is 0, which is not equal to the value stored in the constant storage unit 10, so the determination means 914 returns 0 to the co-occurrence search unit 96 (steps 114 and 118).
(6) The co-occurrence search unit 96 writes 0 to the search buffer 7 and ends the search (step 73).
(7) The analysis means 5 records the search result and next examines the co-occurrence relation between the words "car" (車) and "carry" in candidate 2.
(8) Write "car" and "carry" to the search buffer 7 and call the co-occurrence search unit 96 (step 71).
(9) The co-occurrence search unit 96 calls the determination means 914 (step 72).
(10) The determination means 914 checks the first digit of the discrimination information of "car" (steps 112 and 113).
(11) The first digit of the discrimination information of "car" is 1, which equals the value stored in the constant storage unit 10, so the determination means 914 goes on to the discrimination information of the next word (steps 115 and 116).
(12) The determination means 914 checks the second digit of the discrimination information of "carry" (steps 112 and 113).
(13) The second digit of the discrimination information of "carry" is 1 and "carry" is the last word, so the determination means 914 returns 1 to the co-occurrence search unit 96 (steps 115 to 117).
(14) The co-occurrence search unit 96 searches the co-occurrence dictionary 4. The co-occurrence relation consisting of "car" and "carry" is registered in the co-occurrence dictionary 4, so 1 is written to the search buffer 7 (step 74).
(15) Likewise, the discrimination information is checked for the words "come" and "inspire" (鼓舞) in candidate 3 and for the words "car" and "inspire" in candidate 4; the determination means 914 returns 0 in both cases, so both search results are 0.
After the co-occurrence search unit 6 has processed candidates 1 to 4, the analysis means 5 outputs candidate 2, "車で運ぶ。" (carry by car), whose co-occurrence relation is registered, from the output unit as the conversion result.

【００４８】以上のように、本発明による文章解析装置
では、ここで挙げた例文をかな漢字変換する過程で共起
辞書４の検索が１回しか行われない。As described above, in the sentence analyzing apparatus according to the present invention, the co-occurrence dictionary 4 is searched only once in the process of converting the example sentence described above into kana-kanji characters.

【００４９】以下では、第３の発明の実施例について図
面を参照しながら説明する。Hereinafter, an embodiment of the third invention will be described with reference to the drawings.

【００５０】図１２は本発明の文章解析装置の一実施例
を示すブロック図である。この文章解析装置は、解析す
る文章を入力する入力部１，入力部１から入力された情
報を一時格納する入力バッファ２，単語の読みや品詞な
どの情報を登録している単語辞書１２３，単語の共起関
係を登録している共起辞書４，単語辞書１２３及び共起
辞書４を用いて入力バッファ２に格納されている文章を
解析する解析手段５，解析手段５によって指定された複
数の単語間の共起関係を共起辞書４から検索する共起検
索部１２６，検索すべき単語の情報及び検索結果を格納
する検索バッファ７，解析手段５による解析結果を一時
格納する出力バッファ８，出力バッファ８の解析結果を
出力する出力部９，検索バッファ７に格納されている複
数の単語の判別情報を互いに比較する比較部１２１０，
比較部１２１０を用いて共起辞書４の検索を行うかどう
かを判別する判別部１２１１，全体の動作の制御を行う
制御部１２１３を備えている。比較部１２１０と判別部
１２１１とは、判別手段１２１２を構成している。図１
２において、図１と同じ番号のものは同じものを示して
いるので説明を省略する。FIG. 12 is a block diagram showing an embodiment of the sentence analyzer of the present invention. The sentence analyzer comprises: an input unit 1 for inputting the sentence to be analyzed; an input buffer 2 that temporarily stores the information input from the input unit 1; a word dictionary 123 in which information such as the reading and part of speech of each word is registered; a co-occurrence dictionary 4 in which co-occurrence relations between words are registered; analysis means 5 that analyzes the sentence stored in the input buffer 2 using the word dictionary 123 and the co-occurrence dictionary 4; a co-occurrence search unit 126 that searches the co-occurrence dictionary 4 for a co-occurrence relation among the words specified by the analysis means 5; a search buffer 7 that stores the information on the words to be searched and the search result; an output buffer 8 that temporarily stores the analysis result of the analysis means 5; an output unit 9 that outputs the analysis result held in the output buffer 8; a comparison unit 1210 that compares with one another the discrimination information of the words stored in the search buffer 7; a determination unit 1211 that uses the comparison unit 1210 to determine whether to search the co-occurrence dictionary 4; and a control unit 1213 that controls the overall operation. The comparison unit 1210 and the determination unit 1211 together constitute determination means 1212. Components in FIG. 12 bearing the same numbers as in FIG. 1 are the same and are not described again.

【００５１】以下では、本実施例の動作を説明する。（ａ）制御部１２１３は、入力部１に文章の入力を指示
する。（ｂ）入力部１は、文章を入力し、入力バッファ２に格
納する。（ｃ）制御部１２１３は、解析手段５に文章解析の実行
を指示する。（ｄ）解析手段５によって入力バッファ２中の文章の解
析を行う。（ｅ）解析が終わると解析手段５は、結果を出力バッフ
ァ８に出力する。（ｆ）制御部１２１３は、出力部９に解析結果の出力を
指示する。（ｇ）出力部９は、出力バッファ８に格納されている解
析結果を出力する。Hereinafter, the operation of this embodiment will be described. (A) The control unit 1213 instructs the input unit 1 to input a sentence. (B) The input unit 1 inputs a sentence and stores it in the input buffer 2. (C) The control unit 1213 instructs the analysis unit 5 to execute a sentence analysis. (D) The analysis means 5 analyzes the text in the input buffer 2. (E) When the analysis is completed, the analysis means 5 outputs the result to the output buffer 8. (F) The control unit 1213 instructs the output unit 9 to output the analysis result. (G) The output unit 9 outputs the analysis result stored in the output buffer 8.

【００５２】図１３は、本実施例の単語辞書１２３の構
成例を示す概念図である。図１３で、Ａが単語の表記、
Ｂが単語の読み、Ｃが単語の品詞、Ｄが判別情報であ
る。他の単語との共起関係が共起辞書４に登録されてい
ない場合、判別情報は共起関係が登録されていないこと
を示す値（以下の説明では例として０を用いる）であ
る。共起関係が登録されている場合は、判別情報は正の
整数値をとる。この判別情報は、共起関係の登録されて
いる単語同士では同じ値になるように選択する。したが
って、判別情報が違う値の単語同士の共起関係は共起辞
書４に登録されていない（同じ値なら共起関係が登録さ
れているとは限らない）。FIG. 13 is a conceptual diagram showing an example of the configuration of the word dictionary 123 of this embodiment. In FIG. 13, A is a notation of a word,
B is the reading of the word, C is the part of speech of the word, and D is the discrimination information. When the co-occurrence relation with another word is not registered in the co-occurrence dictionary 4, the discrimination information is a value indicating that the co-occurrence relation is not registered (0 is used as an example in the following description). When the co-occurrence relationship is registered, the discrimination information takes a positive integer value. The discrimination information is selected so that words registered in the co-occurrence relation have the same value. Therefore, the co-occurrence relation between words having different values of the discrimination information is not registered in the co-occurrence dictionary 4 (the co-occurrence relation is not necessarily registered if the value is the same).

【００５３】以下では、単語辞書１２３の単語に上記の
性質を持つ判別情報を与える方法の一例を説明する。図
１４は、判別情報を各単語に与える方法を示す流れ図で
ある。以下、図１４に従って説明する。判別情報を与え
る単語数をｎ、各単語をＷ₁，Ｗ₂，・・・，Ｗ_n、単語
Ｗ_iの判別情報をＷ_i・ｆと表す。（ａ）Ｗ_i・ｆ←−１（１≦ｊ≦ｎ）ｈ←１ｉ←１（ステップ１４１）（ｂ）Ｗ_i+1，・・・，Ｗ_nのうちで、Ｗ_iとの共起関係
が登録されている単語で、かつ、判別情報が−１のもの
をすべて求める。求めた単語の個数をｍ、単語をＷ_a1，
Ｗ_a2，・・・，Ｗ_aj，・・・，Ｗ_am（ｉ≦ａｊ≦ｎ）と
する（ステップ１４２）。（ｃ）Ｗ_i・ｆ＝−１かつｍ＝０なら、Ｗ_i・ｆ←０とし
て（ｆ）へ（ステップ１４３）。（ｄ）Ｗ_i・ｆ＝−１なら、Ｗ_i・ｆ＝ｈ、Ｗ_aj. ｆ←ｈ
（１≦ｊ≦ｍ）、ｈ←ｈ＋１として（ｆ）へ（ステップ
１４４）。（ｅ）Ｗ_i・ｆ≠−１なら、Ｗ_aj・ｆ←Ｗ_i・ｆ（１≦ｊ
≦ｍ）とする（ステップ１４５）。（ｆ）ｉ←ｉ＋１（ステップ１４６）（ｇ）ｉ≦ｎなら（ｂ）へ戻る（ステップ１４７）。（ｈ）終了図１５は、前記の方法による判別情報の値と共起辞書４
に登録されている共起関係の関係を示す概念図である。
図１５で破線で囲んである単語は同じ判別情報の値（図
１５で四角で囲んである数値）を持っている。図１５に
示されているように、直接・間接に共起関係にある単語
の判別情報は同じ値になり、判別情報の値が違う単語の
間には共起関係が登録されていない。An example of a method for giving the words of the word dictionary 123 discrimination information with the above property is described below. FIG. 14 is a flowchart showing a method of assigning discrimination information to each word. The description follows FIG. 14. Let n be the number of words to which discrimination information is assigned, W_1, W_2, ..., W_n the words, and W_i·f the discrimination information of the word W_i.
(a) W_j·f ← -1 (1 ≤ j ≤ n), h ← 1, i ← 1 (step 141).
(b) Among W_i+1, ..., W_n, find all words that have a registered co-occurrence relation with W_i and whose discrimination information is -1. Let m be the number of words found and W_a1, W_a2, ..., W_aj, ..., W_am (i ≤ aj ≤ n) the words themselves (step 142).
(c) If W_i·f = -1 and m = 0, set W_i·f ← 0 and go to (f) (step 143).
(d) If W_i·f = -1, set W_i·f ← h, W_aj·f ← h (1 ≤ j ≤ m), and h ← h + 1, then go to (f) (step 144).
(e) If W_i·f ≠ -1, set W_aj·f ← W_i·f (1 ≤ j ≤ m) (step 145).
(f) i ← i + 1 (step 146).
(g) If i ≤ n, return to (b) (step 147).
(h) End.
FIG. 15 is a conceptual diagram showing the relationship between the discrimination information values obtained by this method and the co-occurrence relations registered in the co-occurrence dictionary 4. Words enclosed by the same broken line in FIG. 15 have the same discrimination information value (the number shown in a square in FIG. 15). As FIG. 15 shows, words that co-occur directly or indirectly have the same discrimination information value, and no co-occurrence relation is registered between words whose discrimination information values differ.
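
The property shown in FIG. 15 (words linked directly or indirectly by co-occurrence share one value, unlinked words get 0) can be obtained with an ordinary connected-component traversal. The Python sketch below uses that swapped-in technique rather than reproducing the step-by-step procedure of FIG. 14. The function name `assign_discrimination_info` is hypothetical, and the relation ("come", "guest") is an invented placeholder that merely gives "come" a group of its own; the other relations are the ones quoted elsewhere in the text.

```python
from collections import defaultdict


def assign_discrimination_info(words, relations):
    """Label each word: 0 if it is in no relation, else a positive group id.

    Words connected directly or indirectly through registered relations
    receive the same id, so words with different ids never co-occur.
    """
    neighbors = defaultdict(set)
    for relation in relations:
        for word in relation:
            neighbors[word].update(w for w in relation if w != word)

    info = {}
    next_id = 1
    for word in words:
        if word in info:
            continue
        if not neighbors[word]:
            info[word] = 0          # no co-occurrence relation registered
            continue
        info[word] = next_id
        stack = [word]              # depth-first walk over the whole group
        while stack:
            current = stack.pop()
            for other in neighbors[current]:
                if other not in info:
                    info[other] = next_id
                    stack.append(other)
        next_id += 1
    return info


words = ["car", "carry", "come", "inspire", "bait", "animal", "give", "guest"]
relations = [("car", "carry"), ("bait", "carry"),
             ("animal", "bait", "give"), ("come", "guest")]  # last one is hypothetical
info = assign_discrimination_info(words, relations)
print(info["car"], info["carry"], info["come"], info["inspire"])
```

Equal ids only mean that a lookup is still needed: "car" and "give" share a group here although no relation between them is registered.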

【００５４】判別手段１２１２は検索バッファ７から単
語の判別情報を読み込み、検索バッファ７中のすべての
単語の判別情報が等しいかどうかを比較部１２１０を用
いて調べる。そして、判別手段１２１２は、検索バッフ
ァ７中の単語からなる共起関係が共起辞書４に登録され
ていないことを示す値（以下の説明では例として０を用
いる）、または、共起辞書４を検索する必要があること
を示す値（以下の説明では例として１を用いる）のいず
れかを共起検索部１２６に返す。The determination means 1212 reads the discrimination information of the words from the search buffer 7 and uses the comparison unit 1210 to check whether the discrimination information of all the words in the search buffer 7 is equal. The determination means 1212 then returns to the co-occurrence search unit 126 either a value indicating that the co-occurrence relation consisting of the words in the search buffer 7 is not registered in the co-occurrence dictionary 4 (0 is used as an example in the following description) or a value indicating that the co-occurrence dictionary 4 needs to be searched (1 is used as an example in the following description).

【００５５】比較部１２１０は検索バッファ７中の単語
の判別情報を判別部１２１１から受け取り、判別情報の
値が０のものはないかを調べ、０でないならばすべての
判別情報の値が等しいかどうかを調べる。The comparison unit 1210 receives the discrimination information of the words in the search buffer 7 from the determination unit 1211, checks whether any of the values is 0, and, if none is 0, checks whether all the discrimination information values are equal.

【００５６】図１６は本実施例における判別手段１２１
２の動作の一例を示す流れ図である。以下、図１６に従
って判別手段１２１２の動作を説明する。説明にあた
り、検索バッファ７中の単語数をｎ、ｉ番目の単語をＷ
_i、単語Ｗ_iの判別情報をＷ_i・ｆと表す。（ａ）ｉ←１とする。比較部１２１０の初期化を行う
（ステップ１６１）。（ｂ）Ｗ_i・ｆを比較部１２１０にわたす（ステップ１
６２）。（ｃ）比較部が０を返したなら（ｇ）へ（ステップ１６
３）。（ｄ）ｉ←ｉ＋１（ステップ１６４）（ｅ）ｉ≦ｎなら（ｂ）へ戻る（ステップ１６５）。（ｆ）共起検索部１２６に１を返し、終了する（ステッ
プ１６６）。（ｇ）共起検索部１２６に０を返し、終了する（ステッ
プ１６７）。FIG. 16 is a flowchart showing an example of the operation of the determination means 1212 in this embodiment. The operation of the determination means 1212 is described below with reference to FIG. 16. In the description, n is the number of words in the search buffer 7, W_i is the i-th word, and W_i·f is the discrimination information of the word W_i.
(a) Set i ← 1 and initialize the comparison unit 1210 (step 161).
(b) Pass W_i·f to the comparison unit 1210 (step 162).
(c) If the comparison unit returns 0, go to (g) (step 163).
(d) i ← i + 1 (step 164).
(e) If i ≤ n, return to (b) (step 165).
(f) Return 1 to the co-occurrence search unit 126 and end (step 166).
(g) Return 0 to the co-occurrence search unit 126 and end (step 167).

【００５７】図１７は本実施例における比較部１２１０
の動作の一例を示す流れ図である。以下、図１７に従っ
て比較部１２１０の動作を説明する。説明にあたり、判
別部１２１１からわたされる判別情報をＷ．ｆと表す。
またｆは、判別情報を記憶するためのメモリである。（ａ）初期化の指示があったなら、ｆ＝−１として終了
（ステップ１７１，１７２）。（ｂ）Ｗ．ｆ＝０なら（ｆ）へ（ステップ１７３）。（ｃ）ｆ＝−１ならｆ←Ｗ．ｆとして（ｅ）へ（ステッ
プ１７４，１７５）。（ｄ）Ｗ．ｆ≠ｆなら（ｆ）へ（ステップ１７６）。（ｅ）判別部１２１１に１を返し、終了する（ステップ
１７７）。（ｆ）判別部１２１１に０を返し、終了する（ステップ
１７８）。FIG. 17 is a flowchart showing an example of the operation of the comparison unit 1210 in this embodiment. The operation of the comparison unit 1210 is described below with reference to FIG. 17. In the description, W·f denotes the discrimination information passed from the determination unit 1211, and f is a memory for storing discrimination information.
(a) If initialization is requested, set f ← -1 and end (steps 171 and 172).
(b) If W·f = 0, go to (f) (step 173).
(c) If f = -1, set f ← W·f and go to (e) (steps 174 and 175).
(d) If W·f ≠ f, go to (f) (step 176).
(e) Return 1 to the determination unit 1211 and end (step 177).
(f) Return 0 to the determination unit 1211 and end (step 178).
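
The interplay of the loop of FIG. 16 and the stateful comparison of FIG. 17 can be sketched in Python as follows. This is a hedged sketch: the class name `Comparator` and the function name `needs_search` are hypothetical, and the discrimination information is assumed to be passed as plain integers.

```python
class Comparator:
    """Stateful comparison unit: remembers the first non-zero value seen."""

    def __init__(self):
        self.f = -1

    def reset(self):
        self.f = -1          # step (a): initialization

    def check(self, w_f):
        if w_f == 0:         # step (b): the word has no relation at all
            return 0
        if self.f == -1:     # step (c): first word, remember its value
            self.f = w_f
            return 1
        if w_f != self.f:    # step (d): value differs from the first word
            return 0
        return 1             # step (e)


def needs_search(infos, comparator):
    """Loop of FIG. 16: 1 if every word passes the comparator, else 0."""
    comparator.reset()
    for w_f in infos:
        if comparator.check(w_f) == 0:
            return 0
    return 1


cmp_unit = Comparator()
print(needs_search([2, 1], cmp_unit),
      needs_search([1, 1], cmp_unit),
      needs_search([1, 0], cmp_unit))
```

Resetting the comparator at the start of each call mirrors the initialization in step 161, so one instance can be reused for every candidate.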

【００５８】以下に、かな漢字変換を例にして本実施例
の動作について説明する。「くるまではこぶ。」という
文を入力部１から入力し、解析手段５でかな漢字変換処
理を行う場合を考える。説明に必要な場合、単語辞書１
２３は図１３、共起辞書４は図４の例を用いる。「くる
まではこぶ。」という文の変換結果として、・［候補１］「来るまで運ぶ。」・［候補２］「車で運ぶ。」・［候補３］「来るまでは鼓舞。」・［候補４］「車では鼓舞。」という４通りが考えられる。この中の１つに決定するた
めに、以下の順序で共起辞書の検索を行う。（１）候補１中の単語「来る」と「運ぶ」の共起関係を
調べる。（２）「来る」と「運ぶ」を検索バッファ７に書き込
み、共起検索部１２６を呼ぶ（ステップ７１）。（３）共起検索部１２６は、判別手段１２１２を呼ぶ
（ステップ７２）。（４）判別手段１２１２は、比較部１２１０の初期化を
行う（ステップ１６１）。比較部１２１０ではメモリｆ
に−１が記憶される（ステップ１７１，１７２）。
（５）判別手段１２１２は、比較部１２１０で「来る」
の判別情報を調べる（ステップ１６２）。（６）「来る」の判別情報は２（≠０）でｆ＝−１なの
で、比較部１２１０はｆ←２として判別手段１２１２に
１を返す（ステップ１７５，１７７）。（７）判別手段１２１２は、比較部１２１０で「運ぶ」
の判別情報を調べる（ステップ１６２）。（８）「運ぶ」の判別情報は１（≠０）でｆ（＝２）と
は等しくないので、比較部１２１０は判別手段１２１２
に０を返す（ステップ１７６，１７８）。（９）判別手段１２１２は，共起検索部６に０を返す
（ステップ１６３，１６７）。（１０）共起検索部６は、検索バッファ７に０を書き込
み検索を終了する（ステップ７３）。（１１）解析手段５は、検索結果を記録し、次に候補２
中の単語「車」と「運ぶ」の共起関係を調べる。（１２）「車」と「運ぶ」を検索バッファ７に書き込
み、共起検索部６を呼ぶ（ステップ７１）。（１３）共起検索部６は、判別手段１２１２を呼ぶ（ス
テップ７２）。（１４）判別手段１２１２は、比較部１２１０の初期化
を行う（ステップ１６１）。比較部１２１０ではメモリ
ｆに−１が記憶される（ステップ１７１，１７２）。（１５）判別手段１２１２は、比較部１２１０で「車」
の判別情報を調べる（ステップ１６２）。（１６）「車」の判別情報は１（≠０）でｆ＝−１なの
で、比較部１２１０はｆ←２として判別手段１２１２に
１を返す（ステップ１７５，１７７）。（１７）判別手段１２１２は、比較部１２１０で「運
ぶ」の判別情報を調べる（ステップ１６２）。（１８）「運ぶ」の判別情報は１（≠０）でｆ（＝１）
と等しいので、比較部１２１０は判別手段１２１２に１
を返す（ステップ１７６，１７７）。（１９）「運ぶ」が最後の単語なので、判別手段１２１
２は共起検索部６に１を返す（ステップ１６４−１６
６）。（２０）共起検索部６は、共起辞書４を検索する。
「車」と「運ぶ」とからなる共起関係が共起辞書４に登
録されているので検索バッファ７に１を出力する（ステ
ップ７４）。（２１）続いて候補３中の単語「来る」と「鼓舞」の判
別情報を調べ、「鼓舞」の判別情報が０なので、判別手
段１２１２は共起検索部６に０を返し、共起検索部６は
検索バッファ７に０を書き込み検索を終了する。（２２）最後に候補４中の単語「車」と「鼓舞」の判別
情報を調べ、「鼓舞」の判別情報が０なので、判別手段
１２１２は共起検索部６に０を返し、共起検索部６は検
索バッファ７に０を書き込み検索を終了する。候補１−
４について共起検索部６による処理が終わったあと、解
析手段５は、共起関係の登録されていた候補２「車で運
ぶ。」を変換結果として出力部から出力する。The operation of this embodiment is described below, taking kana-kanji conversion as an example. Consider the case where the sentence "くるまではこぶ。" is input from the input unit 1 and the analysis means 5 performs kana-kanji conversion. Where needed for the explanation, the word dictionary 123 is that of FIG. 13 and the co-occurrence dictionary 4 is that of FIG. 4. The sentence "くるまではこぶ。" has four possible conversion results:
[Candidate 1] "来るまで運ぶ。" (carry until coming)
[Candidate 2] "車で運ぶ。" (carry by car)
[Candidate 3] "来るまでは鼓舞。" (inspire until coming)
[Candidate 4] "車では鼓舞。" (inspire by car)
To select one of them, the co-occurrence dictionary is searched in the following order.
(1) Examine the co-occurrence relation between the words "come" (来る) and "carry" (運ぶ) in candidate 1.
(2) Write "come" and "carry" to the search buffer 7 and call the co-occurrence search unit 126 (step 71).
(3) The co-occurrence search unit 126 calls the determination means 1212 (step 72).
(4) The determination means 1212 initializes the comparison unit 1210 (step 161). In the comparison unit 1210, -1 is stored in the memory f (steps 171 and 172).
(5) The determination means 1212 has the comparison unit 1210 check the discrimination information of "come" (step 162).
(6) The discrimination information of "come" is 2 (≠ 0) and f = -1, so the comparison unit 1210 sets f ← 2 and returns 1 to the determination means 1212 (steps 175 and 177).
(7) The determination means 1212 has the comparison unit 1210 check the discrimination information of "carry" (step 162).
(8) The discrimination information of "carry" is 1 (≠ 0), which is not equal to f (= 2), so the comparison unit 1210 returns 0 to the determination means 1212 (steps 176 and 178).
(9) The determination means 1212 returns 0 to the co-occurrence search unit 6 (steps 163 and 167).
(10) The co-occurrence search unit 6 writes 0 to the search buffer 7 and ends the search (step 73).
(11) The analysis means 5 records the search result and next examines the co-occurrence relation between the words "car" (車) and "carry" in candidate 2.
(12) Write "car" and "carry" to the search buffer 7 and call the co-occurrence search unit 6 (step 71).
(13) The co-occurrence search unit 6 calls the determination means 1212 (step 72).
(14) The determination means 1212 initializes the comparison unit 1210 (step 161). In the comparison unit 1210, -1 is stored in the memory f (steps 171 and 172).
(15) The determination means 1212 has the comparison unit 1210 check the discrimination information of "car" (step 162).
(16) The discrimination information of "car" is 1 (≠ 0) and f = -1, so the comparison unit 1210 sets f ← 1 and returns 1 to the determination means 1212 (steps 175 and 177).
(17) The determination means 1212 has the comparison unit 1210 check the discrimination information of "carry" (step 162).
(18) The discrimination information of "carry" is 1 (≠ 0), which equals f (= 1), so the comparison unit 1210 returns 1 to the determination means 1212 (steps 176 and 177).
(19) "carry" is the last word, so the determination means 1212 returns 1 to the co-occurrence search unit 6 (steps 164 to 166).
(20) The co-occurrence search unit 6 searches the co-occurrence dictionary 4. The co-occurrence relation consisting of "car" and "carry" is registered in the co-occurrence dictionary 4, so 1 is written to the search buffer 7 (step 74).
(21) Next, the discrimination information of the words "come" and "inspire" (鼓舞) in candidate 3 is checked. The discrimination information of "inspire" is 0, so the determination means 1212 returns 0 to the co-occurrence search unit 6, which writes 0 to the search buffer 7 and ends the search.
(22) Finally, the discrimination information of the words "car" and "inspire" in candidate 4 is checked. The discrimination information of "inspire" is 0, so the determination means 1212 returns 0 to the co-occurrence search unit 6, which writes 0 to the search buffer 7 and ends the search.
After the co-occurrence search unit 6 has processed candidates 1 to 4, the analysis means 5 outputs candidate 2, "車で運ぶ。" (carry by car), whose co-occurrence relation is registered, from the output unit as the conversion result.

【００５９】以上のように、本発明による文章解析装置
では、ここで挙げた例文をかな漢字変換する過程で共起
辞書４の検索が１回しか行われない。As described above, in the sentence analyzing apparatus according to the present invention, the co-occurrence dictionary 4 is searched only once during the process of converting the example sentence described above into kana-kanji characters.
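
The count of one dictionary search can be checked with a small Python simulation of the four candidates. It is an illustrative sketch only: `INFO`, `COOCCURRENCE` and `needs_search` are hypothetical names, the discrimination values are the ones quoted in the walk-through (2 for "come", 1 for "car" and "carry", 0 for "inspire"), and the comparison is written as a stateless check rather than as the stateful unit of FIG. 17.

```python
# Discrimination information quoted in the walk-through.
INFO = {"come": 2, "car": 1, "carry": 1, "inspire": 0}

# Co-occurrence dictionary, reduced to the one relation the example relies on.
COOCCURRENCE = {("car", "carry")}


def needs_search(words):
    """Return 1 only if all values are non-zero and equal to one another."""
    values = [INFO[word] for word in words]
    return int(all(v != 0 and v == values[0] for v in values))


candidates = [("come", "carry"), ("car", "carry"),
              ("come", "inspire"), ("car", "inspire")]

lookups = 0   # how many times the co-occurrence dictionary is searched
results = []
for pair in candidates:
    if needs_search(pair):
        lookups += 1
        results.append(int(pair in COOCCURRENCE))
    else:
        results.append(0)

print(results, lookups)
```

Candidate 1 is rejected here because 2 and 1 differ, so only candidate 2 reaches the dictionary.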

【００６０】[0060]

【発明の効果】以上説明した３つの発明では、共起辞書
に登録されていない単語の組み合わせを見分けるための
おのおの異なった形式の判別情報を単語辞書に付与し、
その判別情報を用いて入力された単語の組み合わせが共
起辞書に登録されていないことを判別する判別部を備え
ることにより、共起辞書の検索の一部を省略でき、辞書
検索の時間を削減できる効果がある。In each of the three inventions described above, discrimination information for recognizing combinations of words that are not registered in the co-occurrence dictionary is added to the word dictionary, in a different form for each invention, and a determination unit uses that discrimination information to determine that an input combination of words is not registered in the co-occurrence dictionary. Part of the co-occurrence dictionary searches can therefore be omitted, which reduces the time spent on dictionary searches.

[Brief description of the drawings]

FIG. 1 is a block diagram showing an embodiment of the sentence analysis apparatus according to the first invention.

FIG. 2 is a block diagram showing a configuration example of a conventional sentence analysis apparatus.

FIG. 3 is a conceptual diagram showing a configuration example of the word dictionary of FIG. 2.

FIG. 4 is a conceptual diagram showing a configuration example of the co-occurrence dictionary 4.

FIG. 5 is a flowchart showing the co-occurrence search procedure of the sentence analysis apparatus of FIG. 2.

FIG. 6 is a conceptual diagram showing a configuration example of the word dictionary of FIG. 1.

FIG. 7 is a flowchart showing an example of the operation of the co-occurrence search unit of the present invention.

FIG. 8 is a flowchart showing an example of the operation of the discrimination unit of FIG. 1.

FIG. 9 is a block diagram showing an embodiment of the sentence analysis apparatus according to the second invention.

FIG. 10 is a conceptual diagram showing a configuration example of the word dictionary of FIG. 9.

FIG. 11 is a flowchart showing an example of the operation of the discrimination unit of FIG. 9.

FIG. 12 is a block diagram showing an embodiment of the sentence analysis apparatus according to the third invention.

FIG. 13 is a conceptual diagram showing a configuration example of the word dictionary of FIG. 12.

FIG. 14 is a flowchart showing a procedure for adding discrimination information to the words in the word dictionary of FIG. 12.

FIG. 15 is a conceptual diagram showing the relationship between the co-occurrence relations of words and the discrimination information assigned by the method of FIG. 14.

FIG. 16 is a flowchart showing an example of the operation of the discrimination unit of FIG. 12.

FIG. 17 is a flowchart showing an example of the operation of the comparison unit of FIG. 12.

[Explanation of symbols]

1 Input unit
2 Input buffer
3 Word dictionary
4 Co-occurrence dictionary
5 Analysis means
6, 26, 126 Co-occurrence search unit
7 Search buffer
8 Output buffer
9 Output unit
10 Discrimination unit
11 Comparison unit
12 Discrimination means
13 Control unit

Claims

(57) [Claims]

1. A sentence analysis apparatus comprising: an input unit for inputting a sentence to be analyzed; a word dictionary in which the readings, notations, and parts of speech of words are registered on a file device or in a memory; a co-occurrence dictionary in which combinations of a plurality of words having a co-occurrence relationship are registered on a file device or in a memory; a co-occurrence search unit that searches the co-occurrence dictionary; an analysis unit that analyzes the sentence input from the input unit using the word dictionary and the co-occurrence dictionary; and an output unit that outputs an analysis result, wherein the word dictionary has, for each registered word, discrimination information indicating whether a co-occurrence relationship between that word and another word is registered in the co-occurrence dictionary, the apparatus further comprises a discrimination unit that refers to the discrimination information to identify combinations of words whose co-occurrence relationship is not registered in the co-occurrence dictionary, and the co-occurrence search unit omits the search of the co-occurrence dictionary in accordance with the determination of the discrimination unit.

2. A sentence analysis apparatus comprising: an input unit for inputting a sentence to be analyzed; a word dictionary in which the readings, notations, and parts of speech of words are registered on a file device or in a memory; a co-occurrence dictionary in which combinations of a plurality of words having a co-occurrence relationship are registered on a file device or in a memory; a co-occurrence search unit that searches the co-occurrence dictionary; an analysis unit that analyzes the sentence input from the input unit using the word dictionary and the co-occurrence dictionary; and an output unit that outputs an analysis result, wherein the co-occurrence dictionary registers each combination of a plurality of words having a co-occurrence relationship with the words arranged in order, the word dictionary has, for each registered word, discrimination information indicating whether a co-occurrence relationship between that word and other words is registered in the co-occurrence dictionary and, if it is registered, at which position in the order the word is registered, the apparatus further comprises a discrimination unit that refers to the discrimination information to identify combinations of words whose co-occurrence relationship is not registered in the co-occurrence dictionary, and the co-occurrence search unit omits the search of the co-occurrence dictionary in accordance with the determination of the discrimination unit.

3. A sentence analysis apparatus comprising: an input unit for inputting a sentence to be analyzed; a word dictionary in which the readings, notations, and parts of speech of words are registered on a file device or in a memory; a co-occurrence dictionary in which combinations of a plurality of words having a co-occurrence relationship are registered on a file device or in a memory; a co-occurrence search unit that searches the co-occurrence dictionary; an analysis unit that analyzes the sentence input from the input unit using the word dictionary and the co-occurrence dictionary; and an output unit that outputs an analysis result, wherein the word dictionary has, for each word, discrimination information set so that words whose co-occurrence relationship is registered in the co-occurrence dictionary have the same value, the apparatus further comprises a discrimination unit that refers to the discrimination information to identify combinations of words whose co-occurrence relationship is not registered in the co-occurrence dictionary, and the co-occurrence search unit omits the search of the co-occurrence dictionary in accordance with the determination of the discrimination unit.