JPH0869463A

JPH0869463A - Device and method for Japanese syllabary (kana) and Chinese character (kanji) conversion

Info

Publication number: JPH0869463A
Application number: JP7180785A
Authority: JP
Inventors: Yasuo Koyama; 泰男小山
Original assignee: EE I SOFUTO KK
Current assignee: EE I SOFUTO KK
Priority date: 1994-06-22
Filing date: 1995-06-22
Publication date: 1996-03-12
Anticipated expiration: 2022-01-24
Also published as: JP3873299B2

Abstract

PURPOSE: To obtain the desired phrase (bunsetsu) segmentation candidates by verifying dependency relations with the causative and passive voices and derivative notations taken into account, so that the input is segmented into phrases accurately. CONSTITUTION: In the phrase segmentation process, attention is given to the word of the later phrase, and it is judged whether that word has dependency information (step S300). When a corresponding dependent word is present but its adjunct word is not a permitted one, a causative/passive dependency verification process is performed (step S342). Because the permitted adjunct words change in causative and passive constructions, this is verified; when the adjunct word is permitted, the dependency is treated as established and the segmentation is adopted as a phrase candidate, and the range from the head word to the dependent word is excluded from the search range of subsequent dependency checks (steps S350 and S360). Consequently, whether a dependency holds is decided with the causative and passive voices taken into account, which improves the likelihood of obtaining the desired phrase segmentation. Further, dependencies are verified using only the representative notation.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a kana-kanji conversion device and a kana-kanji conversion method, and more specifically to a kana-kanji conversion device and method that use information on dependency relations between words for phrase (bunsetsu) segmentation or for selecting the kanji candidates of words.

[0002]

[Prior Art] Various kana-kanji conversion devices, which convert a kana character string entered from a keyboard or the like into the desired mixed kana-kanji text, have been proposed as input devices or editing devices for Japanese text. Such a device should not require the user to specify word and phrase boundaries one by one, and the converted character string should have the notation the user intended. Because Japanese has many homophones and many words that share a native (kun) reading but differ in meaning, obtaining the desired mixed kana-kanji text without error ultimately requires analyzing the meaning of the sentence. Semantic analysis, however, requires a knowledge base of at least tens of thousands of organically interrelated words, which is extremely difficult to realize.

[0003] Conventional kana-kanji conversion devices therefore refine the phrase segmentation process and the learning process used in homophone selection, attempting to obtain the result the user wants without analyzing meaning. Known phrase segmentation methods include the two-phrase longest-match method, which takes two phrases as the basic unit and adopts as the first candidate the segmentation that yields the longest phrase among the phrases that can be formed, and the minimum-cost method, which assigns costs to the words that may make up a phrase and to combinations of those words, and adopts as the first candidate the phrase whose score satisfies a predetermined condition. Known learning processes include homophone learning, in which the word the user most recently selected from among a set of homophones becomes the top candidate next time, and phrase-length learning, in which the length the user most recently specified for a phrase containing a given word is given top priority.
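As a rough illustration of the minimum-cost idea mentioned above, the following sketch segments a kana string by dynamic programming so that the summed word cost is minimal. It is a simplification under stated assumptions: the dictionary `WORD_COST`, its cost values, and the function name are hypothetical, and only per-word costs are used (the method described above also scores combinations of words).

```python
from typing import Dict, List, Optional

# Hypothetical toy dictionary: reading -> cost (lower means more plausible).
WORD_COST: Dict[str, int] = {
    "きしゃ": 1, "が": 1, "する": 1,
    "き": 3, "しゃ": 3, "す": 3, "る": 3,
}


def min_cost_segmentation(kana: str, costs: Dict[str, int]) -> Optional[List[str]]:
    """Return the segmentation with the lowest total cost, or None if none exists."""
    n = len(kana)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[i]: minimal cost of segmenting kana[:i]
    back = [-1] * (n + 1)    # back[i]: start index of the last word in that segmentation
    best[0] = 0
    for end in range(1, n + 1):
        for start in range(end):
            word = kana[start:end]
            if word in costs and best[start] + costs[word] < best[end]:
                best[end] = best[start] + costs[word]
                back[end] = start
    if best[n] == inf:
        return None
    # Walk the back-pointers from the end of the string to recover the words.
    words: List[str] = []
    pos = n
    while pos > 0:
        words.append(kana[back[pos]:pos])
        pos = back[pos]
    return words[::-1]
```

With the toy costs above, `min_cost_segmentation("きしゃがきしゃする", WORD_COST)` prefers the three-character word 「きしゃ」 over the costlier split into 「き」 and 「しゃ」.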

[0004] More recently, devices have been proposed that focus on specific relations between words (for example, 「熱い」 and 「お茶」 in 「熱いお茶」 (hot tea), or 「暑い」 and 「夏」 in 「暑い夏」 (hot summer)). By preparing a dictionary that stores these relations, such a device selects, once one word (for example 「お茶」) has been identified, a word related to it (for example 「熱い」) as the first candidate (see, for example, the "kana-kanji conversion device" of Japanese Patent Laid-Open No. H3-105664 and the "kana-kanji conversion device" of Japanese Patent Laid-Open No. H4-277861). Such a specific relation between words is called a "dependency" (係り受け) or "co-occurrence" (共起).

[0005]

[Problems to Be Solved by the Invention] However, these kana-kanji conversion devices merely examine the relations between words within phrases obtained after phrase segmentation has already been performed. If the segmentation is wrong, the carefully prepared dictionary of word relations is of no use. Moreover, they examine at most the relations between adjacent words, so in some cases the most natural mixed kana-kanji text cannot be obtained. On the other hand, if the range over which word relations are examined is widened indiscriminately, the number of combinations grows geometrically with the number of kana characters entered, and the conversion takes an unacceptably long time to complete.

[0006] Furthermore, when the relation between words is merely the information that two or more words are used close to one another (hereinafter called "co-occurrence" in the broad sense), the role of particles in Japanese is overlooked and a correct conversion result cannot be obtained. For example, if only the information that 「記者」 (reporter) and 「帰社（する）」 (return to the office) co-occur is stored, 「きしゃがきしゃする」 can be converted correctly to 「記者が帰社する」 (the reporter returns to the office), but 「きしゃにきしゃする」 becomes 「記者に帰社する」, which is not a correct conversion. One conceivable remedy is to store dependency information that includes the adjunct words (particles and the like) permitted between the dependent word and the head word, in the form "noun" + "particle" + "declinable word" (for example, 「記者」 + 「が」 or 「は」 + 「帰社する」). In Japanese, however, the particle, which is an adjunct word, changes in passive and causative expressions, so this in turn makes it impossible to convert 「きしゃにきしゃさせる」 (to have the reporter return to the office) correctly. Yet if the range of permitted adjunct words is widened so that dependencies are recognized in passive and causative cases, the accuracy of dependency-based kana-kanji conversion falls.

[0007] In addition, Japanese permits different notations for the same word, as in 「受付」, 「受け付け」, and 「受付け」. To recognize dependencies for all of these notations, the words in every notation would have to be registered not only in the word dictionary but also in the dependency dictionary. If every derivative notation were registered so that all of them could take part in dependencies, the dependency dictionary would become extremely large, and the time required for dependency verification would also grow, possibly to an unacceptable level.

[0008] The kana-kanji conversion device and kana-kanji conversion method of the present invention solve these problems. Their object is to obtain the desired mixed kana-kanji text by segmenting the input character string and changing the priority of kanji candidates using dependency relations that take the causative and passive voices into account. To this end, the following configuration is adopted.

[0009]

[Means for Solving the Problems and Operation] The kana-kanji conversion device of the present invention receives a kana character string, refers to a word dictionary, segments the input kana character string into phrases to create phrase segmentation candidates, and uses those candidates to generate candidate character strings that make up mixed kana-kanji text. In essence, the device comprises: a dependency information dictionary that stores information on the dependent word and the head word constituting a dependency between given phrases, together with information on the permitted adjunct words allowed between the dependent word and the head word; phrase search means that, when the input character string is segmented into phrases, refers to the dependency information and searches for phrases containing words that match the dependent-word and head-word information; first judging means that, when phrases containing such words are found, judges whether the word attached to the dependent word is one of the permitted adjunct words; second judging means that, when phrases containing such words are found and the adjunct word following the head word expresses the causative or the passive, judges whether the word attached to the dependent word is one that corresponds to the causative or passive; and phrase candidate limiting means that limits the phrase segmentation candidates based on the judgments of the first and second judging means.

[0010] The kana-kanji conversion method corresponding to this device receives a kana character string, refers to a dictionary, segments the input kana character string into phrases, and generates mixed kana-kanji character string candidates. In essence, when the input character string is segmented into phrases, the method refers to dependency information, in which information on dependencies between given phrases is stored together with information on the permitted adjunct words allowed between the dependent word and the head word, and searches for phrases containing words that match the dependency information. When phrases containing words that match the dependent-word and head-word information are found, the method judges whether the word attached to the dependent word is one of the permitted adjunct words and, if the adjunct word following the head word expresses the causative or the passive, judges whether the word attached to the dependent word is one that corresponds to the causative or passive. The phrase segmentation candidates are then limited based on the result of either judgment.

[0011] According to the first kana-kanji conversion device and the kana-kanji conversion method of the present invention configured as above, the dependency information dictionary stores information on the dependent word and head word constituting a dependency between given phrases, together with information on the permitted adjunct words allowed between the two. When the input character string is segmented into phrases, the dependency information stored in this dictionary is consulted and phrases containing words that match it are searched for. When phrases matching the dependent-word and head-word information are found, it is judged whether the word attached to the dependent word is a permitted adjunct word and, if the adjunct word following the head word expresses the causative or the passive, whether the word attached to the dependent word corresponds to the causative or passive. The phrase segmentation candidates are limited based on the result. Thus, when dependency information exists, a dependency is not regarded as established merely because both words are present. It is regarded as established when the word attached to the dependent word permits the relation between the two words, and, when the head word is in the causative or passive, when the word attached to the dependent word corresponds to the causative or passive. Because the phrase candidates are limited on this basis, undesired segmentation candidates become less likely to be selected and the desired segmentation becomes more likely. Alternatively, instead of limiting the phrase candidates, the priority of the kanji candidates of each phrase may be changed on the premise of a phrase segmentation already estimated by another method, which raises the likelihood of obtaining the desired kanji candidate as the first candidate.
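The two judgments described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the entry layout, the names `DEPENDENCY_DICT`, `VOICE_SUFFIXES`, and `VOICE_PARTICLE_MAP`, and in particular the rule that a permitted 「が」 may surface as 「に」 or 「を」 under the causative (and as 「に」 under the passive) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set, Tuple


@dataclass(frozen=True)
class DependencyEntry:
    dependent: str                     # dependent word, e.g. 記者
    head: str                          # head word, e.g. 帰社
    allowed_particles: FrozenSet[str]  # adjunct words permitted between the two


# Hypothetical dependency information dictionary with a single entry.
DEPENDENCY_DICT: Dict[Tuple[str, str], DependencyEntry] = {
    ("記者", "帰社"): DependencyEntry("記者", "帰社", frozenset({"が", "は"})),
}

# Assumed voice markers following the head word.
VOICE_SUFFIXES: Dict[str, str] = {"させる": "causative", "される": "passive"}

# Assumed mapping: how a permitted particle may change under each voice.
VOICE_PARTICLE_MAP: Dict[str, Dict[str, Set[str]]] = {
    "causative": {"が": {"に", "を"}},
    "passive": {"が": {"に"}},
}


def dependency_holds(dependent: str, particle: str, head: str, head_suffix: str) -> bool:
    """Decide whether the dependency is established for the given phrases."""
    entry = DEPENDENCY_DICT.get((dependent, head))
    if entry is None:
        return False
    # First judgment: is the attached particle one of the permitted adjunct words?
    if particle in entry.allowed_particles:
        return True
    # Second judgment: if the head is causative or passive, accept particles
    # that correspond to a permitted particle under that voice.
    voice = VOICE_SUFFIXES.get(head_suffix)
    if voice is None:
        return False
    mapping = VOICE_PARTICLE_MAP[voice]
    return any(particle in mapping.get(p, set()) for p in entry.allowed_particles)
```

Under these assumptions, 「記者が帰社する」 and 「記者に帰社させる」 are accepted, while 「記者に帰社する」 is rejected, so a segmentation producing the last reading would not be favored by the dependency check.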

[0012] Here, the dictionary may be a word dictionary that stores the readings of words and the notations corresponding to those readings and that has a notation information section which, when several notations exist for the reading of a word, stores one notation designated as the representative notation and the others designated as derivative notations. The phrase search means may then be means that searches for dependencies using only the representative notations stored in the word dictionary, and the device may further comprise candidate character string display means that, for the phrase segmentation candidates, displays the converted candidate character strings using both the representative and the derivative notations stored in the notation information section of the word dictionary.

[0013] In this kana-kanji conversion device, the word dictionary stores representative notations and derivative notations. Dependency verification for each phrase segmented by the first kana-kanji conversion device is performed using only the representative notations in order to identify the words, while the candidate character strings for the identified words are displayed using both the representative and the derivative notations.
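The split between verification and display can be sketched as below. The data layout is hypothetical: `WordEntry`, `WORD_DICT`, `DEPENDENCY_PAIRS`, and the sample pairing of 「受付」 with 「済ませる」 are assumptions introduced only for illustration. The point shown is that the dependency table is keyed by the representative notation alone, while every notation remains available as a display candidate.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class WordEntry:
    reading: str
    representative: str                                    # used for dependency checks
    derivatives: List[str] = field(default_factory=list)   # used for display only

    def all_notations(self) -> List[str]:
        return [self.representative, *self.derivatives]


# Hypothetical word dictionary: one reading with three notations.
WORD_DICT: Dict[str, WordEntry] = {
    "うけつけ": WordEntry("うけつけ", "受付", ["受け付け", "受付け"]),
}

# Hypothetical dependency table, registered for the representative notation only.
DEPENDENCY_PAIRS: Set[Tuple[str, str]] = {("受付", "済ませる")}


def has_dependency(reading: str, head: str) -> bool:
    """Verify a dependency using only the representative notation of the reading."""
    entry = WORD_DICT.get(reading)
    return entry is not None and (entry.representative, head) in DEPENDENCY_PAIRS


def display_candidates(reading: str) -> List[str]:
    """List every notation (representative first) for display to the user."""
    entry = WORD_DICT.get(reading)
    return entry.all_notations() if entry else []
```

In this sketch the dependency table holds one row per word rather than one per notation, which is the size and lookup-time saving the text describes.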

[0014] Further, in the above kana-kanji conversion device, the phrase candidate limiting means may be replaced by kanji candidate priority means that changes the priority of the kanji candidates of each phrase based on the judgments of the first and second judging means. In this case, candidates that do not form a dependency are still retained as kanji candidates, but candidates for which a dependency is established are given priority, which is preferable.

[0015] The phrase search means of the kana-kanji conversion device may also comprise backward search means that, starting from a later phrase and excluding any already registered searched range, searches successively toward the front of the sentence for a phrase containing a word that matches the dependency information, and searched-range registration means that, when such a phrase is found, registers the span from the starting phrase to the found phrase as a searched range of the dependency information. With this arrangement the range is excluded from the search range at the next search, so the search time for segmentation is short, and crossing dependencies are not selected by mistake.
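A minimal sketch of the backward search with searched-range registration follows. It rests on one reading of the text that should be treated as an assumption: once a span is registered, the phrases inside it are skipped both as starting points and as search targets. Phrases are represented by placeholder labels, and `dependency_pairs` holds hypothetical (dependent, head) pairs.

```python
from typing import List, Set, Tuple


def find_dependencies(
    phrases: List[str],
    dependency_pairs: Set[Tuple[str, str]],
) -> List[Tuple[int, int]]:
    """Return (dependent_index, head_index) pairs found by searching backward."""
    searched = [False] * len(phrases)   # True for phrases inside a registered range
    found: List[Tuple[int, int]] = []
    # Take each phrase as a starting point, beginning with the last one.
    for head_idx in range(len(phrases) - 1, 0, -1):
        if searched[head_idx]:
            continue
        # Search toward the front of the sentence, skipping registered ranges.
        for dep_idx in range(head_idx - 1, -1, -1):
            if searched[dep_idx]:
                continue
            if (phrases[dep_idx], phrases[head_idx]) in dependency_pairs:
                found.append((dep_idx, head_idx))
                # Register the span from the found phrase to the starting phrase.
                for k in range(dep_idx, head_idx + 1):
                    searched[k] = True
                break
    return found
```

For phrases `["A", "B", "C", "D"]` with the pairs `("B", "D")` and `("A", "C")`, the search from D finds B and registers the span B to D. The pair A and C, which would cross that span, is then never examined.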

[0016] If the phrase candidate limiting means of the above kana-kanji conversion device includes means that preferentially selects the phrase segmentation containing the words concerned when the first or second judging means has determined that the relevant dependent word, head word, and adjunct word are present, then segmentations in which a dependency is established are given priority, which is preferable.

[0017] In the kana-kanji conversion device provided with the kanji candidate priority means, that means may include means that, when the first or second judging means has determined that the relevant dependent word, head word, and adjunct word are present, preferentially selects the phrase segmentation containing those words and also selects those words as the first candidates of the kana-kanji conversion. In this case both the segmentation and the choice of first candidate are based on the dependency, so the establishment of a dependency is given the highest priority.

[0018] Furthermore, if the phrase search means includes means that judges the dependency to be established when the auxiliary words lying between the words in the dependency relation have a predetermined specific grammatical structure, the amount of information that must be held about adjunct words can be reduced.

[0019] In a kana-kanji conversion device provided with phrase search means, the phrase search means may also comprise non-adjacent phrase search means that, starting from a given phrase and referring to the dependency information stored in the dependency dictionary, searches for a phrase containing a word that matches the dependency information, extending the search to phrases other than those adjacent to the starting phrase, and search range exclusion means that, when the non-adjacent phrase search means finds a dependency relation, excludes the range from the starting phrase to the found phrase from the search range of the next search by the non-adjacent phrase search means.

[0020] In this case, crossing dependencies are eliminated, which avoids misjudging the correct dependency, and the dependency search can be made faster.

[0021] In such a kana-kanji conversion device, a known phrase segmentation method can be applied to any range in which no phrase containing words in a dependency relation was found by referring to the dependency dictionary. For example, the two-phrase longest-match method may be used, or a score (cost) may be assigned to how readily words and/or phrases join, and the words and/or phrases may be selected so that this readiness is greatest (that is, the cost is lowest). Preferably, the device is configured to select the combination for which the readiness of both the word-to-word and the phrase-to-phrase joins is greatest.

[0022] The second kana-kanji conversion device of the present invention receives a kana character string, segments the input kana character string into phrases, and generates candidate character strings that make up the mixed kana-kanji text corresponding to the input. In essence, the device comprises: a word dictionary that stores the readings of words and the notations corresponding to those readings and that has a notation information section which, when several notations exist for the reading of a word, stores one notation designated as the representative notation and the others designated as derivative notations; a dependency information dictionary that stores information on the dependent word and the head word constituting a dependency between given phrases; segmentation processing means that performs phrase segmentation using the word dictionary; grammar processing means that, for each segmented phrase, performs dependency verification with the dependency information dictionary using only the representative notations stored in the word dictionary, and thereby identifies the words that make up the input character string; and candidate character string display means that, for the identified words, displays the converted candidate character strings using both the representative and the derivative notations stored in the notation information section of the word dictionary.

[0023] The kana-kanji conversion method corresponding to this device receives a kana character string, segments the input kana character string into phrases, and generates candidate character strings that make up the mixed kana-kanji text corresponding to the input. In essence, the word dictionary, which stores the readings of words and the notations corresponding to those readings, stores, when several notations exist for the reading of a word, one notation designated as the representative notation and the others designated as derivative notations; the dependency information dictionary stores information on the dependent word and the head word constituting a dependency between given phrases; phrase segmentation is performed using the word dictionary; for each segmented phrase, dependency verification with the dependency information dictionary is performed using only the representative notations stored in the word dictionary, thereby identifying the words that make up the input character string; and, for the identified words, the converted candidate character strings are displayed using both the representative and the derivative notations stored in the word dictionary.

[0024] In this kana-kanji conversion device and method, dependency verification with the dependency information dictionary uses only the representative notations registered in the word dictionary to identify the words that make up the input character string, whereas the candidate character strings of the identified words are displayed using not only the representative notations but also the derivative notations stored in the word dictionary. Fast grammar processing, including dependency processing, is thereby reconciled with variety of notation.

[0025]

[Embodiment] To further clarify the configuration and operation of the present invention described above, a preferred embodiment of the invention is described below. FIG. 1 is a block diagram showing the control logic of the kana-kanji conversion, and FIG. 2 is a block diagram showing the hardware on which this control logic actually runs. As shown in FIG. 2, the device is built around a well-known CPU 21 and comprises the following parts, interconnected by a bus 31. Each part connected to the CPU 21 through the bus 31 is described briefly below.

[0026]
ROM 22: a mask memory that stores the kana-kanji conversion program and the like
RAM 23: a readable and writable memory that constitutes the main memory
Keyboard interface 25: an interface that handles key input from the keyboard 24
CRTC 27: a CRT controller that controls signal output to the CRT 26, which is capable of color display
Printer interface 29: an interface that controls data output to the printer 28
Hard disk controller (HDC) 30: an interface that controls the hard disk 32

The hard disk 32 stores various programs that are loaded into the RAM 23 and executed, the kana-kanji conversion processing program provided in the form of a device driver, the various conversion dictionaries referred to by that program, and the like.

[0027] The hardware configured in this way is used to enter text and to perform kana-kanji conversion, editing, display, printing, and so on. That is, a character string entered from the keyboard 24 undergoes predetermined processing by the CPU 21, is stored in a predetermined area of the RAM 23, and is displayed on the screen of the CRT 26 via the CRTC 27.

[0028] Next, the functions executed by this hardware are described with reference to FIG. 1. The configuration and operation of each part shown in FIG. 1 are outlined below; the processing described is executed by the central processing unit (CPU 21) based on data entered from the keyboard 24, and all of it is performed by the CPU 21. For kana-kanji conversion, operating the keyboard 24 triggers a predetermined interrupt process, which starts a device driver that converts the entered key images into the corresponding kana character string and then converts that string into a mixed kana-kanji character string. Of course, on a computer capable of parallel processing, the kana-kanji conversion may instead be performed by a single application (an input method) that hands the conversion result to whichever application needs it. In that case the input method takes charge of all input from the keyboard 24.

【００２９】キーボード２４からのキーイメージは、文
字入力部４０により受け付けられ、ここで、対応する仮
名文字列に変換される。ローマ字入力の場合には所定の
変換テーブルを参照して、仮名文字列に変換する。一つ
の仮名文字が得られる度に文字入力部４０は、その仮名
文字を変換制御部４２に送出する。この変換制御部４２
は、仮名漢字変換の中心的な役割を果たす所であり、後
述する種々の仮名漢字変換を制御して、結果を変換後文
字列出力部４４に送出する。変換後文字列出力部４４
は、現実には、ＣＲＴＣ２７に信号を送り、ＣＲＴ２６
に変換後文字列を表示する。The key image from the keyboard 24 is accepted by the character input unit 40, where it is converted into the corresponding kana character string. For romaji input, the conversion to kana is made by referring to a predetermined conversion table. Each time one kana character is obtained, the character input unit 40 sends it to the conversion control unit 42. The conversion control unit 42 plays the central role in kana-kanji conversion: it controls the various kana-kanji conversion processes described later and sends the result to the converted character string output unit 44. In practice, the converted character string output unit 44 sends a signal to the CRTC 27 and displays the converted character string on the CRT 26.
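
The table-driven romaji input described above can be sketched as follows. This is a minimal illustration only: the table holds just the syllables needed for the example sentence, and the function name `romaji_to_kana` and the buffering strategy are assumptions, not details given in the specification.

```python
# Hypothetical, deliberately tiny conversion table (romaji -> kana).
ROMAJI_TO_KANA = {
    "ku": "く", "ru": "る", "ma": "ま", "de": "で",
    "ha": "は", "ko": "こ", "wo": "を", "bu": "ぶ",
}

def romaji_to_kana(keys: str):
    """Convert keystrokes to kana, emitting one kana each time a table entry completes.

    Returns the kana produced so far and any keystrokes still pending.
    """
    kana = []
    pending = ""
    for key in keys:
        pending += key
        if pending in ROMAJI_TO_KANA:
            kana.append(ROMAJI_TO_KANA[pending])  # one kana obtained: pass it on
            pending = ""
    return "".join(kana), pending
```

Each completed kana would be handed to the conversion control unit as soon as it is available; here the results are simply collected in a list.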

【００３０】変換制御部４２は、受け取った仮名文字を
文字列入力部５０に引き渡す。文字列入力部５０は、文
字格納部５２に仮名文字列を格納する。この文字列に基
づいて、自立語候補作成部５４と付属語候補作成部６４
とが、単語データの候補を作成する。自立語候補作成部
５４は、ハードディスク３２に予め記憶された自立語辞
書５８を用い、自立語解析位置管理部５６の管理の下
で、得られた仮名文字列から自立語候補を抽出する処理
を行なう。一方、付属語候補作成部６４は、同じく付属
語辞書６８を用い、付属語解析位置管理部６６の管理の
下で、得られた仮名文字列から付属語候補を抽出する処
理を行なう。解析位置を移動しつつ、自立語候補と付属
語候補を抽出する処理については、後述する。The conversion control unit 42 passes the received kana characters to the character string input unit 50. The character string input unit 50 stores the kana character string in the character storage unit 52. Based on this character string, the independent word candidate creation unit 54 and the adjunct word candidate creation unit 64 create candidates for word data. The independent word candidate creation unit 54 uses the independent word dictionary 58 stored in advance on the hard disk 32 and, under the management of the independent word analysis position management unit 56, extracts independent word candidates from the kana character string. Likewise, the adjunct word candidate creation unit 64 uses the adjunct word dictionary 68 and, under the management of the adjunct word analysis position management unit 66, extracts adjunct word candidates from the kana character string. The process of extracting independent word candidates and adjunct word candidates while moving the analysis position is described later.

【００３１】ここで、自立語辞書５８は、学習により、
同音異義語や接辞などの優先順位を変更する。この学習
処理を行なうのが、係り受け学習部７０，自立語学習部
７２，補助語学習部７４，接辞学習部７６，文字変換学
習部７８である。係り受け学習部７０は、係り受けが成
立する条件で、使用者が係り受けに該当する単語以外の
語を選択した場合、同じ単語の組合わせでは、使用者が
選択した組合わせを優先するよう係り受けの関係を学習
するものである。自立語学習部７２は、同音異義語の存
在する自立語群において、最後に選択された単語を最優
先の候補とするよう学習するものである。補助語学習部
７４は、例えば「ください」などの補助語を「くださ
い」「下さい」など、いずれの語形で変換するかを学習
するものである。更に、接辞学習部７６は、接頭語，接
尾語などの変換形式（例えば、「御」「ご」など）を学
習するものである。文字変換学習部７８は、入力した文
字列をそのままひらがなやカタカナとして確定させた場
合に、その文字列を学習し、次回以降の変換処理では確
定させたひらがなまたはカタカナを候補として出力する
ものである。Here, the priorities of homonyms, affixes, and the like in the independent word dictionary 58 are changed through learning. This learning is performed by the dependency learning unit 70, the independent word learning unit 72, the auxiliary word learning unit 74, the affix learning unit 76, and the character conversion learning unit 78. When a dependency holds but the user selects a word other than the one corresponding to that dependency, the dependency learning unit 70 learns the dependency relation so that, for the same combination of words, the combination selected by the user is given priority. The independent word learning unit 72 learns so that, within a group of homonymous independent words, the most recently selected word becomes the highest-priority candidate. The auxiliary word learning unit 74 learns which form to use when converting an auxiliary word such as "ください" (kudasai), for example the kana form "ください" or the kanji form "下さい". The affix learning unit 76 learns the conversion form of prefixes, suffixes, and the like (for example, "御" versus "ご"). When an input character string is confirmed as-is in hiragana or katakana, the character conversion learning unit 78 learns that string and, in subsequent conversions, outputs the confirmed hiragana or katakana as a candidate.
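
The behaviour described for the independent word learning unit 72 (the most recently selected homonym becomes the first candidate) can be sketched as below. The function name `learn_selection` and the list representation of a homonym group are assumptions made for illustration; the specification does not say how priorities are stored.

```python
def learn_selection(candidates, chosen):
    """Return a new candidate list with the chosen homonym promoted to first place.

    The relative order of the other candidates is left unchanged.
    """
    if chosen not in candidates:
        raise ValueError("chosen word is not among the candidates")
    return [chosen] + [word for word in candidates if word != chosen]
```

For instance, selecting "汽車" from the group for "きしゃ" would move it ahead of "貴社" the next time that reading is converted.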

【００３２】自立語候補作成部５４，付属語候補作成部
６４により、作成された語候補を得て、単語データ作成
部８０が、各語候補についてのデータを作成する。即
ち、得られた自立語と付属語、自立語と自立語、更には
「自立語＋付属語」からなる文節間の接続を接続検定テ
ーブル８４を参照して接続検定部８２が行なった結果、
および全体のコスト計算をコスト計算部８６が行なった
結果を得て、単語毎のデータとして出力するのである。
この単語データは、一旦単語データ格納部１００に格納
され、係り受け候補調整部９０からの調整出力を受け
て、文節分かち書きの処理に用いられる。The word data creation unit 80 receives the word candidates created by the independent word candidate creation unit 54 and the adjunct word candidate creation unit 64 and creates data for each word candidate. That is, it obtains the result of the connection verification unit 82 checking, with reference to the connection verification table 84, the connections between an independent word and an adjunct word, between two independent words, and between phrases each consisting of "independent word + adjunct word", together with the result of the overall cost calculation by the cost calculation unit 86, and outputs these as data for each word. This word data is first stored in the word data storage unit 100 and, after receiving the adjustment output from the dependency candidate adjustment unit 90, is used in the phrase segmentation process.

【００３３】係り受け候補調整部９０は、自立語候補作
成部５４，付属語候補作成部６４からの語候補を受け
て、係り受けの検定を行なうものである。係り受けの検
定は、ハードディスク３２に予め用意された係り受け辞
書９８を参照することによって行なわれる。係り受けの
検定を行なう範囲は、係り受け範囲管理部９６により管
理される。また、係り受けの関係の検定には、いくつか
の許容条件があり、これが使役・受動解析部９２，助詞
許容解析部９４等により判定される。以上の係り受けの
検定により調整された係り受け候補と、先に説明した単
語データとは、単語データ格納部１００により統合さ
れ、文節分かち書き部１０２による文節分かち書きの処
理に供される。文節分かち書き部１０２は、得られたデ
ータから文節分かち書きの第１候補を決定する。The dependency candidate adjustment unit 90 receives the word candidates from the independent word candidate creation unit 54 and the adjunct word candidate creation unit 64 and verifies dependencies. Dependency verification is performed by referring to the dependency dictionary 98 prepared in advance on the hard disk 32. The range over which dependencies are verified is managed by the dependency range management unit 96. Verification of a dependency relation is subject to several allowance conditions, which are judged by the causative/passive analysis unit 92, the particle allowance analysis unit 94, and so on. The dependency candidates adjusted by this verification and the word data described earlier are combined in the word data storage unit 100 and supplied to the phrase segmentation process performed by the phrase segmentation unit 102. The phrase segmentation unit 102 determines the first candidate for phrase segmentation from the data obtained.

【００３４】以上の処理により文節分かち書きの第１候
補と、その文節毎の仮名漢字変換の第１候補が決定され
る。文節分かち書き部１０２は、その候補を文節データ
格納部１０６に格納し、格納された候補は、変換文字列
出力部１０８により変換制御部４２に出力される。変換
制御部４２は、この文字列を候補文字列として表示する
と共に、非所望の文字列が候補となる場合もありえるか
ら、使用者による指示を受けて、次候補の表示や選択な
どの処理を行なう。これらの指示や選択の結果などは、
文節データ格納部１０６や既述した各学習部７０ないし
７８に入力され、文節の一部確定や学習による優先順位
の書き換えなどに用いられる。なお、図示していない
が、使用者により文字列の確定処理がなされると、各部
に一時的に保存されたデータは総て消去され、次の変換
に備える。Through the above processing, the first candidate for phrase segmentation and the first kana-kanji conversion candidate for each phrase are determined. The phrase segmentation unit 102 stores these candidates in the phrase data storage unit 106, and the stored candidates are output to the conversion control unit 42 by the conversion character string output unit 108. The conversion control unit 42 displays this string as a candidate string and, because an undesired string may be proposed, carries out processing such as displaying the next candidate or accepting a selection in response to instructions from the user. These instructions and the results of selection are input to the phrase data storage unit 106 and to the learning units 70 to 78 described above, and are used for partially confirming phrases, rewriting priorities through learning, and so on. Although not illustrated, once the user confirms the character string, all data temporarily held in each unit is erased in preparation for the next conversion.

【００３５】以上、仮名文字の入力から変換語文字列の
出力までを概説したが、次に各処理の詳細について説明
する。まず最初に一般的な文節分かち書きの処理につい
て説明し、次に本発明の要部である係り受けの処理につ
いて説明する。図３は、最小コスト法による文節分かち
書きの処理の概要を示すフローチャートである。図示す
るように、まず、一時的に保存されたデータの消去や解
析位置を１桁目に初期化するなどの初期化の処理（ステ
ップＳ２００）を行なった後、解析位置を求める処理を
行なう（ステップＳ２１０）。解析位置とは、それまで
に入力された仮名文字列の先頭から順に一つずつ進めら
れていく位置である。例えば、図４に示す例文「くるま
ではこをはこぶ」という仮名文字列が入力されていると
すれば、最初の解析位置は１桁目の「く」の位置であ
る。この解析位置で、ハードディスク３２に記憶された
自立語辞書５８および付属語辞書６８を検索する処理を
行なう（ステップＳ２２０）。The flow from the input of kana characters to the output of the converted character string has been outlined above; the details of each process are described next. General phrase segmentation is described first, followed by the dependency processing that is the essential part of the present invention. FIG. 3 is a flowchart outlining phrase segmentation by the minimum cost method. As illustrated, initialization is performed first (step S200), such as erasing temporarily stored data and resetting the analysis position to the first character, and then the analysis position is obtained (step S210). The analysis position advances one character at a time from the beginning of the kana string input so far. For example, if the kana string "くるまではこをはこぶ" (kuruma de hako o hakobu) of the example sentence in FIG. 4 has been input, the first analysis position is the first character, "く". At this analysis position, the independent word dictionary 58 and the adjunct word dictionary 68 stored on the hard disk 32 are searched (step S220).

【００３６】辞書の検索を行なった後、得られた単語に
ついてそれ以前の単語との結合をチェックする処理を行
ない（ステップＳ２３０）、単語間の結合がありえない
語しか得られていない場合には、更に辞書を検索する。
例えば、図４に示した例では、「こをはこぶ」の「は」
について付属語辞書６８から検索された係助詞の「は」
は、そのなど直前の格助詞「を」との結合がありえない
と判断されるから、単語データ作成部８０，接続検定部
８２による接続の検定により、無効なデータとして扱わ
れる。図４では、こうした結合チェックにより無効と判
断された語に符号「×」を付けた。なお、単語間の結合
は、接続検定テーブル８４に予め記憶されているが、こ
の接続検定テーブル８４は、単語の品詞同士の結合の可
能性についての情報を与えるテーブルであり、実施例で
は、４００×４００程度のマトリックスとして与えられ
ている。一つの解析位置での辞書検索と結合チェックが
終われば、解析位置を順に進めて更に処理を繰り返す。After the dictionary search, each word obtained is checked for whether it can connect to the words preceding it (step S230); if only words that cannot connect have been obtained, the dictionary is searched further. In the example of FIG. 4, the binding particle "は" retrieved from the adjunct word dictionary 68 for the "は" in "こをはこぶ" is judged unable to connect to the immediately preceding case particle "を", so the connection verification performed by the word data creation unit 80 and the connection verification unit 82 treats it as invalid data. In FIG. 4, words judged invalid by this connection check are marked "×". The connections between words are stored in advance in the connection verification table 84, which gives information on whether the parts of speech of two words can connect; in the embodiment it is provided as a matrix of about 400 × 400. When the dictionary search and connection check at one analysis position are finished, the analysis position is advanced and the processing is repeated.
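
A connection check of the kind described above can be sketched as a lookup on pairs of part-of-speech labels. The labels and the few table entries below are hypothetical; the embodiment's table is only described as a matrix of about 400 × 400, and its contents are not given.

```python
# Hypothetical fine-grained part-of-speech labels and a tiny subset of pairs
# that are assumed connectable. A pair that is absent is treated as invalid.
CONNECTABLE = {
    ("noun", "case_particle_wo"),
    ("noun", "case_particle_de"),
    ("case_particle_wo", "verb"),
    ("case_particle_de", "noun"),
}

def can_connect(prev_pos: str, next_pos: str) -> bool:
    """Return True when a word tagged next_pos may directly follow one tagged prev_pos."""
    return (prev_pos, next_pos) in CONNECTABLE
```

With this table, the binding particle "は" directly after the case particle "を" is rejected, which corresponds to the word marked "×" in the example. A set of pairs is used here instead of a dense matrix purely for brevity.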

【００３７】結合の可能性のある単語については、次に
コスト計算を行ない、その語の最小総コストを求める処
理を行なう（ステップＳ２４０）。この処理は、コスト
計算部８６が行なうもので、図４（Ａ）に示す例では、
「くるま」は、例えば「く」＋「る」＋「ま」、「く
る」＋「ま」「くるま」と分けることができ、これらに
単語を当てはめてゆくとき、自立語＝２、付属語＝０の
コストを持つものとし、「苦」（自立語）＋「流」（自
立語）ならば、「流」の総コストは４、と求めるもので
ある。この時、「間」のコストが４となるのは、最小の
総コストを求めるからであり、「苦」＋「流」＋「間」
のコスト６ではなく、「来る」＋「間」の場合のコスト
４を採用するからである。「で」「は」は付属語なの
で、それ以前の単語のうち最小のコストの単語「車」＝
２のコストがそれ自身のコストとなる。図４には、各語
のコストを右下に示した。For words that can connect, cost calculation is performed next to obtain the minimum total cost of each word (step S240). This is performed by the cost calculation unit 86. In the example of FIG. 4(A), "くるま" can be divided as "く" + "る" + "ま", "くる" + "ま", or "くるま". When words are assigned to these pieces, an independent word is given a cost of 2 and an adjunct word a cost of 0; thus, for "苦" (independent word) + "流" (independent word), the total cost of "流" is 4. The cost of "間" is 4 because the minimum total cost is taken: the cost of 4 for "来る" + "間" is adopted rather than the cost of 6 for "苦" + "流" + "間". Because "で" and "は" are adjunct words, their cost is that of the lowest-cost preceding word, "車" = 2. In FIG. 4, the cost of each word is shown at its lower right.
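
The minimum total cost calculation described above can be sketched as a small dynamic program. This is an illustration under simplifying assumptions: the toy lexicon and the function name `min_total_costs` are invented for the example, only the simplified costs (independent word = 2, adjunct word = 0) are used, and the connection check is omitted.

```python
INDEPENDENT, ADJUNCT = 2, 0  # per-word costs of the simplified scheme

# Toy lexicon (reading -> [(surface, cost)]); entries are illustrative only.
LEXICON = {
    "く": [("苦", INDEPENDENT), ("区", INDEPENDENT)],
    "る": [("流", INDEPENDENT), ("留", INDEPENDENT)],
    "ま": [("間", INDEPENDENT)],
    "くる": [("来る", INDEPENDENT), ("繰る", INDEPENDENT)],
    "くるま": [("車", INDEPENDENT)],
    "で": [("で", ADJUNCT)],
    "は": [("は", ADJUNCT)],
}

def min_total_costs(kana: str) -> dict:
    """Return {(start, surface): minimum total cost of a word sequence ending in that word}."""
    best_at = {0: 0}  # best_at[i]: cheapest cost of any word sequence covering kana[:i]
    totals = {}
    for end in range(1, len(kana) + 1):
        for start in range(end):
            if start not in best_at:
                continue  # no word sequence reaches this start position
            for surface, cost in LEXICON.get(kana[start:end], []):
                total = best_at[start] + cost
                totals[(start, surface)] = total
                best_at[end] = min(best_at.get(end, total), total)
    return totals
```

For the input "くるまでは", this gives 4 for "流" and "間" and 2 for "車", "で", and "は", matching the figures quoted in the paragraph above.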

【００３８】以上のコスト計算の後で、各単語のコスト
をチェックし、不適切なコストのものを無効とする処理
を行なう（ステップＳ２５０）。不適切なコストとは、
他の語の組合わせと比べてコストが大きくなってしまう
語の組合わせである。即ち、「区」＋「留」といった語
の組合わせを選択することは、その位置までで得られる
他の語「来る」や「繰る」のコストより高くなってしま
うので、不適切なコストと判断して、これを文節候補か
ら除外するのである。この最小コストの考え方から採用
されない語を、図４では、語の右上に「●」として示し
た。なお、図４において、「○」は、その語が、上述し
た結合チェックとコストチェックの結果、文節候補を形
成する可能性のある語として残ったものであることを示
している。After this cost calculation, the cost of each word is checked and words with an inappropriate cost are invalidated (step S250). An inappropriate cost is one belonging to a combination of words that costs more than other combinations. For example, choosing the combination "区" + "留" costs more than the other words available up to that position, "来る" and "繰る", so its cost is judged inappropriate and it is excluded from the phrase candidates. Words rejected under this minimum-cost principle are marked "●" at their upper right in FIG. 4. Words marked "○" in FIG. 4 are those that remain, after the connection check and cost check described above, as words that may form phrase candidates.

【００３９】次に、こうしてコストが与えられた単語候
補をリンクする処理を行なう（ステップＳ２６０）。即
ち、結合が有効とされた語について、その結合関係をポ
インタを設定することで関係づけるのである。図４の例
では、「来る」「繰る」「車」「まで」「で」「は」
「では」などが無効でない語として最小総コストの計算
がなされたから、「来る」「繰る」については「まで」
にリンクし、「車」については「で」「では」にリンク
するというように関係づけるのである。こうした結合チ
ェックやコスト計算、そしてリンクづけの処理を、一つ
の解析位置で総ての単語の検索が完了する間で繰り返
す。また、その解析位置での辞書の検索が完了すると、
更に解析位置を一つ進めて、新たな単語の成立を検討
し、同様に結合チェックやコスト計算などを繰り返す。Next, the word candidates that have been given costs are linked (step S260). That is, words whose connection has been found valid are related to one another by setting pointers. In the example of FIG. 4, the minimum total cost has been calculated for "来る", "繰る", "車", "まで", "で", "は", "では", and so on as words that are not invalid, so "来る" and "繰る" are linked to "まで", and "車" is linked to "で" and "では". The connection check, cost calculation, and linking are repeated until all words have been searched at one analysis position. When the dictionary search at that analysis position is complete, the analysis position is advanced by one, the formation of new words is examined, and the connection check, cost calculation, and so on are repeated in the same way.

【００４０】解析位置が、既に入力された最後の仮名文
字の位置に至り、全語について解析が完了した場合には
（ステップＳ２６５）、以上の処理を前提として、最小
コストのパスを検索する処理を行なう（ステップＳ２７
０）。この処理は、文節分かち書き部１０２が行なうも
ので、有効とされた語の組合わせのなかで、語に付与さ
れたコストの総和が最小になるものを検索する処理であ
る。「くるまではこをはこぶ」の例では、図４（Ｂ）に
実線Ｊのパスとして示すように、「車で」＋「箱を」＋
「運ぶ」という分かち書きが総コスト１８となるので、
最小コストとして選択される。なお、最小コストではな
いが、他の文節分かち書きの候補も検索される。例え
ば、図４（Ｂ）に破線Ｂのパスとして示すように、「車
では」＋「子を」＋「運ぶ」という分かち書き（コスト
＝２０）である。こうして分かち書きの候補を作成した
後（ステップＳ２８０）、今度は各文節の内部での候補
を作成する処理を行なう（ステップＳ２９０）。即ち、
ひとつの文節分かち書きの内部で、例えば「はこを」に
対して「箱を」や「函を」といった候補を用意するので
ある。これらの文節の候補や単語の候補は、使用者によ
り文節の分け方をかえるよう指示されたり、次候補を表
示するよう指示された場合に使用される。When the analysis position reaches the last kana character input so far and all words have been analyzed (step S265), the minimum-cost path is searched for on the basis of the above processing (step S270). This is performed by the phrase segmentation unit 102 and consists of finding, among the valid combinations of words, the one whose sum of word costs is smallest. In the example "くるまではこをはこぶ", the segmentation "車で" + "箱を" + "運ぶ", shown as the solid-line path J in FIG. 4(B), has a total cost of 18 and is selected as the minimum cost. Other segmentation candidates that are not of minimum cost are also found; an example is the segmentation "車では" + "子を" + "運ぶ" (cost = 20), shown as the broken-line path B in FIG. 4(B). After the segmentation candidates have been created in this way (step S280), candidates within each phrase are created (step S290). That is, within one phrase of the segmentation, candidates such as "箱を" and "函を" are prepared for "はこを". These phrase candidates and word candidates are used when the user instructs that the phrase division be changed or that the next candidate be displayed.

【００４１】いま一つの文節分かち書きの例を図５に示
す。この例は、後述する係り受けの説明に用いるもので
あるが、係り受けを考慮しない最小コスト法による文節
分かち書きを、「きしゃをきしゃさせる」について適用
したものを示す。この例では、結合チェック（ステップ
Ｓ２３０）により、「ゃ」は名詞との直接の結合が無効
であることから除外される（×印）。また、「木」や
「氏」、あるいは「社」などは、最小総コストのチェッ
ク（ステップＳ２５０）から除外される（●印）。この
結果、図５に示した例では、「きしゃを」＋「きしゃ」
という文節分かち書きがなされ、各語の優先順位が図５
に示した順序であるとすれば、「きしゃ」の第１候補と
しては「貴社」が選ばれることになる。なお、後半の
「きしゃ」については、解析が「きしゃ」の末尾までま
でしか至っていない場合には、前半の第１候補と同一の
「貴社」が選ばれることになるが、「きしゃ」の後に
「させ」や「する」などが付属する場合には、例えば使
役「させ」が付属する語であることを考慮して「帰社」
を第１候補として表示することができる。FIG. 5 shows another example of phrase segmentation. This example is used in the description of dependencies below, but here it shows phrase segmentation by the minimum cost method, without considering dependencies, applied to "きしゃをきしゃさせる". In this example, the connection check (step S230) excludes "ゃ" because its direct connection to a noun is invalid (marked ×). Words such as "木", "氏", and "社" are excluded by the minimum total cost check (step S250) (marked ●). As a result, in the example of FIG. 5 the segmentation "きしゃを" + "きしゃ" is made and, if the priority of the words is in the order shown in FIG. 5, "貴社" is chosen as the first candidate for "きしゃ". For the second "きしゃ", if the analysis has only reached the end of "きしゃ", the same "貴社" as the first candidate of the first half is chosen; but if "させ", "する", or the like follows "きしゃ", then "帰社" can be displayed as the first candidate, taking into account, for example, that it is a word to which the causative "させ" attaches.

【００４２】なお、以上の説明では、コスト計算は、各
語自身についてのみ行なったが、実際には、単語同士の
結合のしやすさの度合いに応じてコストを下げるポイン
トを付与したり、文節同士の結合について文法的な規則
に基づいて同様に結合し易い文節同士の組合わせにコス
トを下げるポイントを付与することもできる。ここで
は、文節分かち書きの処理に対する理解の便を図って、
最も簡易な手法を用いて説明したに過ぎない。In the above description, the cost was calculated only for each word itself. In practice, points that lower the cost may be given according to how readily words connect to each other, and likewise points that lower the cost may be given to combinations of phrases that readily connect under grammatical rules. The simplest method was used here merely to make phrase segmentation easier to understand.

【００４３】以上の文節分かち書きの処理を踏まえ、次
に係り受けによる文節分かち書きの処理について説明す
る。図６は、係り受け検定を行なう処理を取り出して示
すフローチャートである。この処理は、図３に示したス
テップＳ２２０ないしステップＳ２５０の処理と並行し
て実施される。実際には、解析位置を求めた後（ステッ
プＳ２１０）、各種辞書を検索する際、自立語辞書５
８，付属語辞書６８の検索に併せて、係り受け辞書９８
も検索し、結合チェック、最小総コストの計算に伴うコ
ストチェックと共に、次の係り受け検定処理がなされ
る。この処理が開始されると、まず、解析位置において
候補となり得る語（○印の語）について、係り受け候補
調整部９０が係り受け辞書９８を検索し、係り受け情報
が存在する語であるか否かの判断を行なう（ステップＳ
３００）。なお、解析位置における語が、接続詞、感動
詞、独立語の場合には、係り受けは存在しないとして、
その単語についての処理は直ちに終了する。Building on the phrase segmentation described above, phrase segmentation using dependencies is described next. FIG. 6 is a flowchart showing the dependency verification process in isolation. This process is carried out in parallel with steps S220 to S250 shown in FIG. 3. In practice, after the analysis position is obtained (step S210), the dependency dictionary 98 is searched together with the independent word dictionary 58 and the adjunct word dictionary 68, and the following dependency verification is carried out alongside the connection check and the cost check that accompanies the minimum total cost calculation. When this process starts, the dependency candidate adjustment unit 90 first searches the dependency dictionary 98 for each word that can be a candidate at the analysis position (the words marked ○) and determines whether dependency information exists for that word (step S300). If the word at the analysis position is a conjunction, an interjection, or an independent element (独立語), no dependency is assumed to exist and processing of that word ends immediately.

【００４４】例として、「きしゃをきしゃさせる」とい
う仮名文字が入力されて、解析が「きしゃをきしゃ」ま
で進んだ場合を取り挙げて説明する。この時、後半の
「きしゃ」の候補としては、「記者」「貴社」「汽車」
「帰社」などが得られるから、これらの各語について、
係り受け辞書９８内に何らかの情報が存在するかを調べ
るのである。なお受け語となる語が「聞いた」や「利い
た」など用言であって活用形を有する場合には、語幹
「聞」や「利」あるいは基本形「聞く」や「利く」をキ
ーワードにして、係り受け辞書９８は参照可能に構成さ
れている。As an example, consider the case where the kana string "きしゃをきしゃさせる" has been input and analysis has advanced as far as "きしゃをきしゃ". The candidates for the second "きしゃ" are "記者", "貴社", "汽車", "帰社", and so on, so each of these words is checked for information in the dependency dictionary 98. When the modified word is an inflecting word such as "聞いた" or "利いた", the dependency dictionary 98 is configured so that it can be consulted using the stem ("聞", "利") or the base form ("聞く", "利く") as the keyword.

【００４５】実施例における係り受け辞書９８の構造の
一例を図７に示す。本実施例の係り受け辞書９８は、
［見出し＋受け語（語幹）＋１つの係り語＋付属語情
報］を単位とする構造を持っており、図７の例では、受
け語「帰社」について、見出し「きしゃ」＋受け語「帰
社」＋係り語「記者」＋「が」、見出し「きしゃ」＋受
け語「帰社」＋係り語「貴社」＋「に」、見出し「きし
ゃ」＋受け語「帰社」＋係り語「汽車」＋「で」・・・
・・というように、一つの受け語について、複数の組み
のデータを持っている。更に、見出し「きしゃ」，受け
語「記者」については、係り語群「貴社，新聞，通信社
・・・」を構成する各語について、同様に、見出し「き
しゃ」＋受け語「記者」＋係り語「貴社」＋「の」など
のように、一つの係り語毎にデータを持っている。これ
らのデータは、受け語についての見出し語の五十音順に
並んでいる。もとより、他の語についても、同様の係り
受け情報が記憶されている。係り受け候補調整部９０
は、この係り受け辞書を検索し、該当する見出しおよび
受け語が存在する場合には、係り語の候補を辞書から取
り出して、係り受けの検定に供するのである。なお、こ
れらのデータは、見出しと受け語は同一であるから、デ
ータ群全体の頭に見出し語と受け語を用意し、係り語と
付属語の情報を、個々に用意するものとしても良い。FIG. 7 shows an example of the structure of the dependency dictionary 98 in the embodiment. The dependency dictionary 98 of this embodiment is structured in units of [heading + modified word (stem) + one modifying word + adjunct word information]. In the example of FIG. 7, the modified word "帰社" has several such records: heading "きしゃ" + modified word "帰社" + modifying word "記者" + "が"; heading "きしゃ" + modified word "帰社" + modifying word "貴社" + "に"; heading "きしゃ" + modified word "帰社" + modifying word "汽車" + "で"; and so on. Likewise, for the heading "きしゃ" and modified word "記者", there is one record for each word in the group of modifying words "貴社, 新聞, 通信社, ...", such as heading "きしゃ" + modified word "記者" + modifying word "貴社" + "の". These records are arranged in Japanese syllabary (gojuon) order of the heading of the modified word. Similar dependency information is of course stored for other words. The dependency candidate adjustment unit 90 searches this dependency dictionary and, when the relevant heading and modified word exist, retrieves the modifying word candidates from the dictionary for dependency verification. Since the heading and modified word are the same across such records, the heading and modified word may instead be placed once at the head of the whole group, with the modifying word and adjunct word information provided individually.
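
The record layout described above can be sketched as follows. The class name `DependencyEntry`, its field names, and the in-memory list are assumptions for illustration; only the four records mentioned in the text are included, and the on-disk format of the dependency dictionary 98 is not specified here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DependencyEntry:
    heading: str      # kana reading of the modified word
    modified: str     # modified word
    modifier: str     # one modifying word
    particles: tuple  # adjunct words allowed between the two words

# Records taken from the example in the text; the list itself is hypothetical.
ENTRIES = [
    DependencyEntry("きしゃ", "帰社", "記者", ("が",)),
    DependencyEntry("きしゃ", "帰社", "貴社", ("に",)),
    DependencyEntry("きしゃ", "帰社", "汽車", ("で",)),
    DependencyEntry("きしゃ", "記者", "貴社", ("の",)),
]

def entries_for(heading: str, modified: str) -> list:
    """Return every record stored for the given heading and modified word."""
    return [e for e in ENTRIES if e.heading == heading and e.modified == modified]
```

A linear scan is used only to keep the sketch short; because the records are described as sorted by heading, a real dictionary would more plausibly use a sorted or indexed lookup.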

【００４６】また、図７の例では、係り受けの検定を分
かりやすく示すため、最小限の情報のみ示したが、実際
の係り受け辞書９８は、「受け単語見出し＋係り単語見
出し」、「受け単語見出し長」、「受け単語漢字」、
「受け単語品詞」、「係り単語見出し長」、「係り単語
漢字」、「係り単語品詞」、「係り受け関係」などの詳
細な情報からなる。受け単語や係り単語の品詞は、係り
受けの成立と付属語の許容を検討するために必要であ
り、見出し長のデータは、辞書９８から高速に切り出し
を行なうのに必要である。The example of FIG. 7 shows only the minimum information so that dependency verification is easy to follow. The actual dependency dictionary 98 consists of detailed information such as "modified word heading + modifying word heading", "modified word heading length", "modified word kanji", "modified word part of speech", "modifying word heading length", "modifying word kanji", "modifying word part of speech", and "dependency relation". The parts of speech of the modified and modifying words are needed to examine whether a dependency holds and which adjunct words are allowed, and the heading length data is needed to extract records from the dictionary 98 at high speed.

【００４７】係り受けの情報が存在する語（以下、受け
語という）が見い出された場合には、次に、係り受けが
既に成立したとして登録された範囲を除き、前方に向か
って係り受けに対応する語（以下、係り語という）が存
在するか検索を行ない（ステップＳ３１０）、対応する
係り語があるか否かの判断を行なう（ステップＳ３２
０）。この時、係り語の検索は、最小総コストとなって
いる語のみならず、他の語についても行なわれる。い
ま、係り受け辞書９８には、図７に示したように、「記
者（が）帰社」、「貴社（に）帰社」、「貴社（の）記
者」、「汽車（で）帰社」という係り受けが記憶されて
いるものとする。ここで（）内の仮名は、係り受け関係
を有するとされる語の間に存在する可能性があるとして
許容されている付属語である。ステップＳ３００におい
て受け語となり得ると判断された「帰社」「記者」につ
いて、係り受け情報に受け語が存在するので、これらに
ついて各々係り語が存在するか判断すると、図５に示し
た例では、「帰社」については「きしゃ」という文字列
の候補である「記者」「貴社」「汽車」が該当すると判
断され、「記者」については「きしゃ」という文字列の
候補である「貴社」が該当すると判断される。ステップ
Ｓ３２０で、係り語が存在すると判断された場合には、
次に両語の間に存在する付属語が、係り受けの存在を許
容する語であるか否かの判断を行なう（ステップＳ３３
０）。When a word for which dependency information exists (hereinafter, a modified word) is found, a search is made through the preceding text, excluding any range already registered as having an established dependency, for a word corresponding to the dependency (hereinafter, a modifying word) (step S310), and it is determined whether a corresponding modifying word exists (step S320). This search covers not only the words with the minimum total cost but other words as well. Suppose the dependency dictionary 98 stores, as shown in FIG. 7, the dependencies "記者（が）帰社", "貴社（に）帰社", "貴社（の）記者", and "汽車（で）帰社". The kana in parentheses is the adjunct word allowed to appear between the words in the dependency relation. For "帰社" and "記者", which were judged in step S300 to be possible modified words because the dependency information lists them as such, it is determined whether a modifying word exists for each. In the example of FIG. 5, "記者", "貴社", and "汽車", which are candidates for the string "きしゃ", are judged to qualify for "帰社", and "貴社", a candidate for the string "きしゃ", is judged to qualify for "記者". If step S320 determines that a modifying word exists, it is next determined whether the adjunct word between the two words is one that allows the dependency to hold (step S330).
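
The search for modifying words among the preceding candidates (steps S310 and S320) can be sketched as below. The data shapes are assumptions: `earlier` is a list of (position, candidate words) pairs and `excluded` a list of (start, end) position ranges already registered as having an established dependency. The pairs in `DEPENDENCIES` are the four given in the text.

```python
# (modifying word, modified word) -> allowed adjunct words, from the example.
DEPENDENCIES = {
    ("記者", "帰社"): {"が"},
    ("貴社", "帰社"): {"に"},
    ("貴社", "記者"): {"の"},
    ("汽車", "帰社"): {"で"},
}

def find_modifiers(modified, earlier, excluded=()):
    """Return (position, word) for every earlier candidate that can modify `modified`.

    All candidates are examined, not only the minimum-cost ones, and positions
    inside an excluded range are skipped.
    """
    hits = []
    for pos, candidates in reversed(earlier):  # nearest preceding position first
        if any(start <= pos <= end for start, end in excluded):
            continue
        hits.extend((pos, w) for w in candidates if (w, modified) in DEPENDENCIES)
    return hits
```

Whether the adjunct word actually found between the two words is allowed (step S330) is a separate check and is not part of this sketch.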

【００４８】助詞の許容解析は、係り受けのタイプによ
り定義された許容関係を見たしているかを判断するもの
であり、係り受けのタイプ毎に次の類型を持つ。［Ｉ］連用修飾型名詞＋助詞＋用言の場合の助詞格助詞「が」「から」「で」「と」「に」「へ」「よ
り」「を」「の」係助詞「は」用言連用形＋用言の場合名詞＋用言（助詞省略型）の場合の省略可能な助詞「が」「は」係助詞，副助詞［ＩＩ］連体修飾型名詞＋助詞＋名詞の場合の助詞「の」体言＋体言（並列）の場合の助詞「や」「と」用言連体形＋名詞の場合連体詞＋名詞の場合The particle allowance analysis determines whether the allowance relation defined for the type of dependency is satisfied. Each type of dependency has the following patterns.
[I] Continuative (adverbial) modification type
・Noun + particle + inflecting word (用言): the case particles "が", "から", "で", "と", "に", "へ", "より", "を", "の" and the binding particle "は"
・Continuative form of an inflecting word + inflecting word
・Noun + inflecting word (particle omitted): the omissible particles "が" and "は", binding particles, and adverbial particles
[II] Adnominal modification type
・Noun + particle + noun: the particle "の"
・Nominal + nominal (parallel): the particles "や" and "と"
・Adnominal form of an inflecting word + noun
・Adnominal word (連体詞) + noun

【００４９】即ち、係り受け関係にあると判断された２
つの語の関係が上記のないしのいずれかに属すると
して、係り受け関係にある両語の間に存在する付属語
（大部分は助詞もしくは助詞的表現）が上記のいずれか
に該当する場合は、係り受け辞書９８には係り受け関係
を有する語について許容する助詞の設定がなされている
から、これを検定するのである。例えば、「機転」と
「利く」との間の係り受けが助詞の許容設定（の・が）
を伴っている場合、上記のケース（名詞＋助詞＋用
言）に属するから、「の」「が」は両語間に存在可能で
あるけれども（機転が利いた、機転の利いた→○）、他
の格助詞「から」「で」などは許容できない（機転から
利いた、機転で利いた→×）ということになる。That is, the relation between two words judged to be in a dependency relation is taken to belong to one of the types above, and where the adjunct word between them (in most cases a particle or particle-like expression) is one of those listed above, it is verified against the dependency dictionary 98, which records the particles allowed for the words in the dependency relation. For example, if the dependency between "機転" and "利く" carries the allowed-particle setting (の・が), it belongs to the case noun + particle + inflecting word above, so "の" and "が" may appear between the two words ("機転が利いた", "機転の利いた" → ○), but other case particles such as "から" and "で" are not allowed ("機転から利いた", "機転で利いた" → ×).

【００５０】ないしの各関係について、そこに挙げ
られたもの以外については、許容されると判断する。こ
の許容されると判断する例を以下に列挙するが、これら
は、係り受けとしては実際の表現としては成り立たない
場合を含む可能性がある。しかし、係り受けは、実際の
人間の言語活動としては、広い概念であり、あまりに厳
格な係り受けの取り決めはむしろ現実にそぐわないこと
が多い。また、余りに厳密な係り受けの取り決めは係り
受け辞書９８のいたずらな増大を招くだけであり、係り
受け検定の速度も低下させる。そこで、本実施例では、
付属語の許容について、係り受けの生じる関係をから
に分け、その中で許容・非許容の明確なものについて
は、係り受け辞書に許容するものとして係り受け関係の
成り立つ語と共に記憶し、それ以外については、許容す
るものとしたのである。For each of the relations (1) to (4), anything other than the particles listed there is judged to be allowed. Examples judged allowable are listed below; they may include cases that do not actually occur as dependency expressions. However, dependency is a broad concept in actual human language use, and overly strict rules for it often fail to match reality. Overly strict rules would also needlessly enlarge the dependency dictionary 98 and slow down dependency verification. In this embodiment, therefore, the relations in which dependencies arise are divided into these types for the purpose of adjunct word allowance; within each type, the particles whose allowance or non-allowance is clear-cut are stored in the dependency dictionary as allowed, together with the words between which the dependency holds, and everything else is treated as allowed.
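
The allowance rule described above can be sketched for the noun + particle + inflecting word case. The function name `particle_allowed` is hypothetical, and only this one dependency type is modelled: a particle from the listed set is accepted only if the dictionary entry allows it, while any adjunct word outside the listed set is accepted.

```python
# Particles listed for the noun + particle + inflecting word type in the text.
LISTED_PARTICLES = {"が", "から", "で", "と", "に", "へ", "より", "を", "の", "は"}

def particle_allowed(particle, entry_particles, listed=LISTED_PARTICLES):
    """Decide whether the adjunct word found between two words permits the dependency."""
    if particle in listed:
        # Clear-cut particles must be explicitly allowed by the dictionary entry.
        return particle in entry_particles
    # Anything not in the listed set is treated as allowed.
    return True
```

With the entry for "機転" and "利く" allowing "の" and "が", "機転が利いた" passes, "機転から利いた" is rejected, and a binding particle outside the list such as "も" is accepted.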

【００５１】［ＩＩＩ］許容される表現−連用修飾形の
場合・名詞＋格助詞的表現＋用言における格助詞的表現「ずつ」「として」「のため」「において」「によっ
て」など、・名詞＋係助詞＋用言における係助詞「こそ」「さえ」「しか」「でも」「も」など、・名詞＋副助詞＋用言における副助詞「きり」「くらい」「ずつ」「だけ」など、・名詞＋副助詞的表現＋用言における副助詞的表現「なので」「なら」など、・用言＋助詞＋用言における助詞「のは」など・接続助詞「ので」「から」「から」「て」など、・接続助詞的表現「からには」「ためには」「ほど」
「うえ」など、・用言＋用言を並列させる表現「か」「し」「たり」
「と同時に」など、[III] Allowed expressions: continuative (adverbial) modification
・Case-particle-like expressions in noun + case-particle-like expression + inflecting word: "ずつ", "として", "のため", "において", "によって", etc.
・Binding particles in noun + binding particle + inflecting word: "こそ", "さえ", "しか", "でも", "も", etc.
・Adverbial particles in noun + adverbial particle + inflecting word: "きり", "くらい", "ずつ", "だけ", etc.
・Adverbial-particle-like expressions in noun + adverbial-particle-like expression + inflecting word: "なので", "なら", etc.
・Particles in inflecting word + particle + inflecting word: "のは", etc.
・Conjunctive particles: "ので", "から", "から", "て", etc.
・Conjunctive-particle-like expressions: "からには", "ためには", "ほど", "うえ", etc.
・Expressions that place inflecting words in parallel: "か", "し", "たり", "と同時に", etc.

【００５２】［ＩＶ］許容される表現−連体修飾形・名詞＋助詞的表現＋名詞における助詞的表現「における」「に関する」「に基づいて」など、・用言＋助詞的表現＋名詞における助詞的表現「ための」「といった」「に伴う」「などの」「ごと
き」など、・体言＋体言を並列させる表現「か」。[IV] Allowed expressions: adnominal modification
・Particle-like expressions in noun + particle-like expression + noun: "における", "に関する", "に基づいて", etc.
・Particle-like expressions in inflecting word + particle-like expression + noun: "ための", "といった", "に伴う", "などの", "ごとき", etc.
・Expression that places nominals in parallel: "か".

【００５３】以上の規則に従って、係り受けの関係が見
い出された２つの語の間の付属語の許容について判断す
る。例として挙げた「記者」「帰社」の場合には、許容
される格助詞は「が」であるから、「きしゃをきしゃ」
については係り受けの成立が認められない。そこで、こ
れを判定し（ステップＳ３４０）、係り受けを成立させ
る係り語と受け語か存在するするにもかかわらず、係り
受けが成立しないと判断された場合には、次に使役・受
動の係り受けの検定処理を行なう（ステップＳ３４
２）。In accordance with the above rules, the admissibility of the attached word between the two words for which a dependency relationship has been found is judged. In the example of "記者" (kisha, reporter) and "帰社" (kisha, return to the office), the permitted case particle is "が" (ga), so no dependency is recognized for "きしゃをきしゃ" (kisha wo kisha). This is judged in step S340. If the dependency is judged not to hold even though a modifying word and a receiving word capable of forming it exist, the causative/passive dependency verification process is performed next (step S342).

【００５４】使役・受動の係り受け検定処理は、図１に
示した使役受動解析部９２により行なわれる。この処理
について詳しく説明する。図５に示した文例では、更に
解析が図８に示すように「きしゃをきしゃさせ」まで進
むと、使役・受動であると判断でき、使役・受動の場合
を考慮した係り受け処理を行なうことになる（ステップ
Ｓ３４２）。この処理は、ステップＳ３２０，Ｓ３３０
と同様に、対応する係り語があるかと言う点とその場合
の付属語が許容される語であるかと言う判断である。
「きしゃ」に対して「帰社」に着目すると、対応する語
「記者」は存在し、次に付属語の解析を行なうと、使役
の場合には、本来の付属語「が」については「を」が許
容されることが予め記憶されているから、係り受けが成
立すると判断することになる。なお、「帰社」と「貴
社」との関係は、本来許容される付属語が「に」であ
り、使役の場合であっても「を」が許容される関係では
ないので、係り受けの成立は認められない。同様に「帰
社」と「汽車」＋「で」についても係り受けの検定を行
ない、係り語と受け語との間に使役であることにより許
容される付属語は存在しないことが分かる。The causative/passive dependency verification process is performed by the causative/passive analysis unit 92 shown in FIG. 1. This process is described in detail below. In the example sentence of FIG. 5, once the analysis has advanced to "きしゃをきしゃさせ" (kisha wo kisha sase) as shown in FIG. 8, the expression can be identified as causative/passive, and dependency processing that takes the causative/passive case into account is performed (step S342). As in steps S320 and S330, this processing judges whether a corresponding modifying word exists and whether the attached word in that case is a permitted one.
Taking "帰社" (return to the office) as the candidate for the second "きしゃ", the corresponding word "記者" (reporter) exists. When the attached word is then analyzed, it has been stored in advance that, in the causative, "を" (wo) is permitted in place of the original attached word "が" (ga), so the dependency is judged to hold. For the pair "帰社" and "貴社" (your company), the originally permitted attached word is "に" (ni), and "を" is not permitted even in the causative, so no dependency is recognized. The dependency is likewise verified for "帰社" and "汽車" (train) + "で" (de), and it is found that no attached word is permitted between the modifying word and the receiving word by virtue of the causative.
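
The two-stage check in this paragraph (ordinary verification in step S340, then causative verification in step S342) can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the dictionary contents, the `CAUSATIVE_PARTICLE` table, and all function names are hypothetical.

```python
# Hypothetical dependency dictionary:
# (modifying word, receiving word) -> particle permitted in the plain form.
DEPENDENCY_DICT = {
    ("記者", "帰社"): "が",
    ("貴社", "帰社"): "に",
    ("汽車", "帰社"): "で",
}

# Assumed table: particle of the plain form -> particle permitted in the causative.
CAUSATIVE_PARTICLE = {"が": "を", "に": "に"}


def plain_dependency_holds(modifier, particle, head):
    """Analogue of step S340: the particle must equal the stored one."""
    return DEPENDENCY_DICT.get((modifier, head)) == particle


def causative_dependency_holds(modifier, particle, head):
    """Analogue of step S342: accept the particle the causative form permits."""
    base = DEPENDENCY_DICT.get((modifier, head))
    if base is None:
        return False
    return CAUSATIVE_PARTICLE.get(base) == particle


def verify(modifier, particle, head, is_causative):
    """Try the ordinary check first, then the causative check if applicable."""
    if plain_dependency_holds(modifier, particle, head):
        return True
    return is_causative and causative_dependency_holds(modifier, particle, head)
```

With these tables, "記者を" + "帰社" fails the ordinary check but passes once the causative is recognized, while "貴社を" and "汽車を" fail in both stages.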

【００５５】そこで、これらの解析結果を基に係り受け
の成立について判断し（ステップＳ３４４）、使役・受
動を考慮して係り受けが成立していると判断された場合
には、ステップＳ３４０で通常の係り受けが成立してい
ると判断された場合と共々、優先的にその語を含んだ文
節を、最小総コストの違いを越えて文節候補とする処理
を行なう（ステップＳ３５０）。更にこうして見い出さ
れた受け語から係り語までの間を係り受け成立済み範囲
として登録し、これを管理する処理を行ない（ステップ
Ｓ３６０）、全範囲について係り受けの検索を行なった
か否かの判断（ステップＳ３７０）に進む。なお、通常
の係り受けはもとより使役・受動を考慮しても係り受け
の成立が否定された場合には、ステップＳ３５０，３６
０を行なわず、ステップＳ３７０に移行する。Whether the dependency holds is then judged on the basis of these analysis results (step S344). If the dependency is judged to hold once the causative/passive is taken into account, then, just as when an ordinary dependency is judged to hold in step S340, the phrase containing that word is preferentially made a phrase candidate, overriding any difference in minimum total cost (step S350). The span from the receiving word found in this way back to the modifying word is then registered and managed as a dependency-established range (step S360), and the process proceeds to the judgment of whether the dependency search has covered the entire range (step S370). If the dependency is rejected both as an ordinary dependency and after the causative/passive is taken into account, steps S350 and S360 are skipped and the process moves to step S370.

【００５６】係り受けを、受け語から前方に検索して、
検索済みとして登録された範囲を除いて総ての語につい
て完了するまで、上記の処理（ステップＳ３１０ないし
３７０）を繰り返し、全範囲についての検索が完了する
と、次に受け語についての複数の候補について、係り受
けの検定が完了したか否かの判断を行なう（ステップＳ
３８０）。即ち、この例では、受け語となる後半の「き
しゃ」についての候補「貴社」「帰社」「記者」「汽
車」などについて、総て係り受けの関係が成立するもの
があるか、検定するのである。係り受けの関係が成立す
る語が見い出され、付属語の許容解析もパスし、係り受
けが成立したと判断された語は、文節候補として最も高
い優先順位に設定される（ステップＳ３５０）。複数の
候補単語について係り受けの関係が成立した場合には、
辞書に登録されていた順に優先順位の高い文節候補とす
る。The above processing (steps S310 to S370) is repeated, searching for dependencies from the receiving word toward the front of the sentence, until it has been completed for every word except those in ranges registered as already searched. When the search of the entire range is complete, it is judged whether dependency verification has been completed for all of the candidates for the receiving word (step S380). In this example, each of the candidates for the second "きしゃ", which is the receiving word, such as "貴社", "帰社", "記者", and "汽車", is tested for a dependency relationship that holds. A word for which a dependency relationship is found, which also passes the attached-word admissibility analysis, and for which the dependency is therefore judged to hold is given the highest priority as a phrase candidate (step S350). If dependency relationships hold for several candidate words, they are made phrase candidates with priority in the order in which they were registered in the dictionary.

【００５７】ここで、係り受けの成立した語を含む文節
を文節候補とする際、その文節が最小総コストとなって
いない語を含む文節であっても優先されるという点につ
いて説明する。「きしゃをきしゃさせ」の例では、選択
される文節「きしゃを」「きしゃさせ」は、係り受けに
よる検定を行なわない最小総コスト法による文節候補
と、文節の分け方自体は同じである。しかし、例えば、
「じょうききしゃをきしゃさせ」という仮名文字列が入
力され、「蒸気汽車」という自立語が存在したと仮定す
ると、図９に示すように、「蒸気汽車を」「帰社させ」
が最小コストのパス（実線Ｇ）となって第１候補となっ
てしまう。これに対して、係り受け関係（「記者（が）
帰社」）の使役・受動による検定がなされた場合には、
最小コストのパスとはならない「上記」「記者を」「帰
社させ」が第１候補とされる（図９破線Ｂ）。The following explains why, when a phrase containing a word for which a dependency holds is made a phrase candidate, that phrase is given priority even if it contains a word that does not lie on the minimum-total-cost path. In the example "きしゃをきしゃさせ", the selected phrases "きしゃを" and "きしゃさせ" are divided in the same way as the phrase candidates produced by the minimum-total-cost method without dependency verification. Suppose, however, that the kana string "じょうききしゃをきしゃさせ" (jouki kisha wo kisha sase) is input and that an independent word "蒸気汽車" (steam train) exists. Then, as shown in FIG. 9, "蒸気汽車を" / "帰社させ" is the minimum-cost path (solid line G) and would become the first candidate.
When, on the other hand, the dependency relationship "記者（が）帰社" has been verified through the causative/passive check, "上記" (the above) / "記者を" / "帰社させ", which is not the minimum-cost path, is made the first candidate (broken line B in FIG. 9).
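
The priority rule in this paragraph, where a dependency-verified segmentation outranks a cheaper one, can be sketched as a sort key. The cost values, the `PhraseCandidate` class, and `rank` are assumptions made for this illustration; the source gives no concrete costs.

```python
from dataclasses import dataclass


@dataclass
class PhraseCandidate:
    phrases: list              # the segmentation, one string per phrase
    total_cost: int            # hypothetical total cost of the path
    dependency_verified: bool  # True if a dependency was judged to hold


def rank(candidates):
    """Verified candidates come first; ties are broken by lower total cost."""
    return sorted(candidates,
                  key=lambda c: (not c.dependency_verified, c.total_cost))


# Path G (minimum cost) and path B (dependency verified) of the FIG. 9 example.
path_g = PhraseCandidate(["蒸気汽車を", "帰社させ"], total_cost=6,
                         dependency_verified=False)
path_b = PhraseCandidate(["上記", "記者を", "帰社させ"], total_cost=8,
                         dependency_verified=True)
ranked = rank([path_g, path_b])
```

Sorting on the pair (not verified, cost) keeps the ordinary minimum-cost ordering within each group, so the cost method still decides whenever no dependency is found.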

【００５８】最初の例文「きしゃをきしゃさせる」につ
いて、単語間の接続チェックなどを行なって接続し得な
い候補を削除して最終的に得られた文節候補を図１０に
示す。従って、この文節分かち書きの第１候補は、「記
者を帰社させる」となる。FIG. 10 shows the phrase candidates finally obtained for the first example sentence, "きしゃをきしゃさせる", after candidates that cannot be connected have been removed by checks such as the inter-word connection check. The first candidate of this phrase segmentation is therefore "記者を帰社させる" (have the reporter return to the office).

【００５９】以上使役表現の一例について説明したが、
その類型としては、名詞Ｎ１，動詞Ｐと表記するものと
して、以下のものがある。「Ｎ１を＋Ｐさせる／せる」が、「Ｎ１が＋Ｐ」の使役
型「Ｎ１に＋Ｐさせる／せる」が、「Ｎ１に＋Ｐ」の使役
型「Ｎ１＋Ｐさせる／せる」が、「Ｎ１＋Ｐ」の使役型 An example of a causative expression has been described above. Writing N1 for a noun and P for a verb, the types are as follows.
・ "N1 を + P させる／せる" is the causative form of "N1 が + P".
・ "N1 に + P させる／せる" is the causative form of "N1 に + P".
・ "N1 + P させる／せる" is the causative form of "N1 + P".

【００６０】なお、本実施例では、２文節以上に亘る助
詞検定必要な下記のような使役の型は許容しない。「Ｎ１に＋対して＋Ｐ（動詞未然形）させる」「Ｎ１を＋Ｐ（形容動詞）に＋する」「Ｎ１を＋Ｐ（形容詞）く＋する」「Ｎ１を＋Ｐ（名詞）に＋する」「Ｎ１を＋Ｐ（動詞終止形）ように＋する」「Ｎ１を＋Ｐせしめる」In this embodiment, causative types such as the following, which would require particle verification across two or more phrases, are not accepted.
・ "N1 に + 対して + P (verb, irrealis form) させる"
・ "N1 を + P (adjectival noun) に + する"
・ "N1 を + P (adjective) く + する"
・ "N1 を + P (noun) に + する"
・ "N1 を + P (verb, terminal form) ように + する"
・ "N1 を + P せしめる"

【００６１】更に、受動の場合の係り受けの処理につい
て例示する。係り受けとして「生徒（を）教える」が存
在する場合に、入力した仮名文字列「せいとがおしえら
れる」を文節分かち書きするばあいの処理を例にとって
説明する。図１１は、「せいとをおしえ」まで解析位置
が進み、「教え」について、前方に遡って係り語が存在
するかを検索する場合を示している。「教え」を受け語
とする係り受けはもとより「生徒（を）教える」だけで
はなく、「数学（を）教える」とか「先生（が）教え
る」なども存在するが、これらは係り受け辞書９８に登
録されており、「教」を見出しとして検索することがで
きる。この検索は、前方に遡ってなされるから、「緒」
「尾」から検定が開始され、「聖徒が」「生徒が」に至
って、「教」を見出しとする係り受けの中の「生徒
（を）教える」の「生徒」を見い出すことになる。この
係り受けは、そのままでは「生徒が教え」なので付属語
の許容解析をパスせず、係り受けの検定は一旦打ち切ら
れる。その後、文節分かち書きの検定が進んで「おしえ
られ」まで至って、再度係り受けの検定がなされると、
使役・受動の係り受けの検定により初めて付属語の許容
解析をパスする。従って、受動の場合の係り受けとして
成立と判断され、「生徒が」と「教えられ」とが文節分
かち書きの第１候補となる。この様子を図１２に示す。
得られる第１候補は、「生徒が教えられ」となる。ここ
で、「生徒が」から「教えられ」までは、係り受けの成
立範囲として、その後の係り受けの検索範囲からは除外
される。Dependency processing in the passive case is illustrated next. Assume that the dependency "生徒（を）教える" (teach a student) exists, and consider phrase segmentation of the input kana string "せいとがおしえられる" (seito ga oshierareru). FIG. 11 shows the state in which the analysis position has advanced to "せいとをおしえ" and a search is made backward from "教え" for a modifying word. Dependencies with "教え" as the receiving word include not only "生徒（を）教える" but also "数学（を）教える" (teach mathematics), "先生（が）教える" (the teacher teaches), and others; these are registered in the dependency dictionary 98 and can be retrieved with "教" as the headword. Because the search runs backward,
verification starts from "緒" and "尾" and then reaches "聖徒が" and "生徒が", where "生徒" of "生徒（を）教える" is found among the dependencies under the headword "教". As it stands, this dependency reads "生徒が教え", so it fails the attached-word admissibility analysis and dependency verification is suspended for the time being. Later, when phrase segmentation has advanced to "おしえられ" and dependency verification is performed again,
the attached-word admissibility analysis is passed for the first time through the causative/passive dependency verification. The dependency is therefore judged to hold as a passive, and "生徒が" and "教えられ" become the first candidate of the phrase segmentation, as shown in FIG. 12.
The first candidate obtained is "生徒が教えられ". The span from "生徒が" to "教えられ" is then excluded, as a dependency-established range, from subsequent dependency searches.

【００６２】以上受動表現の一例について説明したが、
その類型としては、名詞Ｎ１，動詞Ｐと表記するものと
して、以下のものがある。「Ｎ１が＋Ｐられる／れる」が、「Ｎ１を＋Ｐ」の受動
型「Ｎ１に＋Ｐられる／れる」が、「Ｎ１が＋Ｐ」の受動
型「Ｎ１＋Ｐせれる／れる」が、「Ｎ１＋Ｐ」の受動型 An example of a passive expression has been described above. Writing N1 for a noun and P for a verb, the types are as follows.
・ "N1 が + P られる／れる" is the passive form of "N1 を + P".
・ "N1 に + P られる／れる" is the passive form of "N1 が + P".
・ "N1 + P せれる／れる" is the passive form of "N1 + P".
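
The causative and passive types listed in the description amount to a mapping from the particle seen in the input to the particle of the underlying plain form. A minimal sketch of that mapping follows; the table layout and the names `UNDERLYING_PARTICLE`, `underlying_particle`, and `holds` are assumptions for illustration, and the small dependency set is hypothetical sample data.

```python
# voice -> {particle in the input: particle of the underlying plain form}.
# The empty string stands for "no particle" (the "N1 + P" types).
UNDERLYING_PARTICLE = {
    "causative": {"を": "が", "に": "に", "": ""},
    "passive": {"が": "を", "に": "が", "": ""},
}

# Hypothetical sample dependencies: (modifying word, particle, receiving word).
DEPENDENCIES = {
    ("生徒", "を", "教え"),
    ("先生", "が", "教え"),
}


def underlying_particle(voice, surface_particle):
    """Return the plain-form particle, or None if the type is not accepted."""
    if voice == "plain":
        return surface_particle
    return UNDERLYING_PARTICLE[voice].get(surface_particle)


def holds(modifier, surface_particle, head, voice):
    """Check the dependency after mapping the particle back to the plain form."""
    base = underlying_particle(voice, surface_particle)
    return base is not None and (modifier, base, head) in DEPENDENCIES
```

Multi-phrase patterns such as "N1 によって + P られる" are simply absent from the table, so `underlying_particle` returns None for them.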

【００６３】なお、本実施例では、２文節以上に亘る助
詞検定必要な下記のような受動の型は許容しない。「Ｎ１に（によって）＋Ｐられる」「Ｎ１に（により）＋Ｐられる」「Ｎ１が＋Ｎ２に＋Ｐられる」「Ｎ１の＋Ｐられる」「Ｎ１は＋Ｎ２に＋Ｐ（さ）れる」「Ｎ１から＋Ｐ（さ）れた＋Ｎ２」In this embodiment, passive types such as the following, which would require particle verification across two or more phrases, are not accepted.
・ "N1 に（によって） + P られる"
・ "N1 に（により） + P られる"
・ "N1 が + N2 に + P られる"
・ "N1 の + P られる"
・ "N1 は + N2 に + P（さ）れる"
・ "N1 から + P（さ）れた + N2"

【００６４】以上説明した本実施例によれば、単語のコ
ストを計算して文節分かち書きの候補を求める処理の過
程で同時に使役・受動の場合を含む係り受け情報も検索
しているので、文節分かち書きの候補を求める段階で、
使役・受動を含む係り受けの情報を反映させることがで
きる。係り受けの情報は、高次の言語活動なので、単語
間や文節間のコスト計算による文節分かち書きの選択の
画一性による弊害を回避して、より使用者の意図に沿っ
た文節分かち書きの候補を求めることが可能となる。し
かも、自立語辞書５８や付属語辞書６８を参照して行な
われる最小コスト法による文節分かち書きの処理と同時
に係り受けの処理もなされるから、係り受けの情報を用
いた文節分かち書きの処理を短時間の内に完了すること
ができる。文節分かち書きを済ませてから改めて係り受
け辞書９８を参照しにゆく場合には、係り受けの情報を
用いて文節の分け方を変更することができないばかり
か、辞書の参照を再度行なうので、処理に時間を要す
る。According to the embodiment described above, dependency information, including the causative/passive cases, is retrieved during the same pass that computes word costs to obtain phrase-segmentation candidates, so this information can be reflected at the stage where the candidates are obtained.
Because dependency information reflects higher-order language activity, it avoids the drawbacks of the uniformity of selecting a segmentation purely by cost calculation between words and between phrases, and makes it possible to obtain segmentation candidates closer to the user's intention. Moreover, dependency processing is performed together with the minimum-cost segmentation processing that refers to the independent word dictionary 58 and the attached word dictionary 68, so segmentation using dependency information can be completed in a short time. If the dependency dictionary 98 were instead consulted only after segmentation had finished, the dependency information could not be used to change how the phrases are divided, and the additional dictionary access would lengthen processing.

【００６５】また、係り受けが一旦成立したと判断され
た場合には、その受け語から係り語までの範囲を係り受
け成立範囲として、その後の検索範囲から除外するの
で、係り受けの範囲が交差することがない。また、２以
上の受け語が一つの係り語を受けるという判断をするこ
ともない。また、係り受けの成立を隣接する文節を越え
て判断するので、副詞などによる修飾が係り受け関係の
間に入っても係り受けの検定を正しく行なうことができ
る。従って、複数の係り受けが成立する場合には、図１
３（Ａ）に示すように、独立した係り受けが別個に成立
する組合わせか、図１３（Ｂ）に示すように、一つの受
け語が２以上の係り語を受ける組合わせか、図１３
（Ｃ）に示すように、一つの係り受けを跨ぐようにもう
一つの係り受けが成立する組合わせが許されることにな
る。Once a dependency is judged to hold, the span from its receiving word back to its modifying word is treated as a dependency-established range and excluded from subsequent searches, so dependency ranges never cross. Nor are two or more receiving words ever judged to receive the same modifying word. In addition, because dependencies are evaluated beyond adjacent phrases, verification works correctly even when a modifier such as an adverb intervenes between the two words. Consequently, when several dependencies hold, the permitted combinations are:
independent dependencies that hold separately, as shown in FIG. 13(A); one receiving word that receives two or more modifying words, as shown in FIG. 13(B); and
one dependency that holds so as to straddle another, as shown in FIG. 13(C).
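
The effect of excluding established ranges from later searches can be sketched with phrase indices. The representation of a dependency as a (modifier index, receiving index) pair and the function names below are assumptions made for this illustration only.

```python
def candidate_modifier_positions(head_index, established):
    """List positions searched backward from head_index.

    established holds (modifier_index, head_index) pairs; every position
    inside such a range (both ends inclusive) is skipped.
    """
    positions = []
    i = head_index - 1
    while i >= 0:
        covering = [m for (m, h) in established if m <= i <= h]
        if covering:
            i = min(covering) - 1  # jump to just before the covering range
        else:
            positions.append(i)
            i -= 1
    return positions


def crosses(a, b):
    """True if two dependency ranges intersect without nesting."""
    (m1, h1), (m2, h2) = sorted([a, b])
    return m1 < m2 < h1 < h2
```

Because a later receiving word can only pick a modifying word outside every established range, a new range is either disjoint from the earlier ones, shares their receiving word, or straddles them, which matches the three permitted combinations.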

【００６６】次に、本発明の第２実施例について説明す
る。第２実施例では、第１実施例と同様のハードウェア
構成を用い、その機能ブロックも図１に示すものとほぼ
同一である。機能ブロックにおいて異なるのは、自立語
辞書５８の構造と、係り受け辞書９８の構造であり、辞
書構造の相違に伴う単語検索処理，係り受け検定処理お
よび表示処理である。第２実施例における処理に従っ
て、これらの相違点および辞書構造の相違について順次
説明する。Next, a second embodiment of the present invention is described. The second embodiment uses the same hardware configuration as the first, and its functional blocks are almost identical to those shown in FIG. 1. The differences lie in the structure of the independent word dictionary 58 and of the dependency dictionary 98, and in the word search, dependency verification, and display processing that follow from the different dictionary structure. These differences are described in turn, following the order of processing in the second embodiment.

【００６７】図１４は、第２実施例における仮名漢字変
換処理ルーチンを示すフローチャートである。この処理
ルーチンは、キーボード２４から一ないし複数の仮名文
字が入力された後、変換キー（例えば「スペースキ
ー」）が押されたとき、開始される処理である。なお、
変換キーが操作されなくても所定数の仮名文字が入力さ
れたとき、あるいは「。」や「、」「．」などの区切り
記号が入力されたときに、図１４の仮名漢字変換処理が
開始されるものとしても差し支えない。この処理が開始
されると、まず単語検索処理（ステップＳ４００）と分
かち書き処理（ステップＳ４２０）とが行なわれる。こ
れらの処理は、第１実施例における図３の処理に該当す
る処理である。FIG. 14 is a flowchart of the kana-kanji conversion routine in the second embodiment. The routine starts when a conversion key (for example, the space key) is pressed after one or more kana characters have been entered from the keyboard 24.
Alternatively, the kana-kanji conversion of FIG. 14 may start without the conversion key, either when a predetermined number of kana characters has been entered or when a delimiter such as "。", "、", or "．" is entered. When the routine starts, word search processing (step S400) and segmentation processing (step S420) are performed first. These correspond to the processing of FIG. 3 in the first embodiment.

【００６８】図１５に、単語検索処理ルーチンの詳細を
示す。図示するように、単語検索処理ルーチンが起動さ
れると、まず単語検索の開始位置Ｍを値１、即ち入力さ
れた仮名文字列の先頭位置とする処理を行なう（ステッ
プＳ４０２）。次に、単語検索における読みの長さを示
す変数Ｌを値１に初期化する処理を行ない（ステップＳ
４０４）、この読みの長さＬの語を自立語辞書５８，付
属語辞書６８から検索する処理を行なう（ステップＳ４
０５）。ここで、自立語辞書５８は、図１６に示すよう
に、ヘッダとインデックスと辞書本体からなる。ヘッダ
は、辞書自体を管理するための情報である。インデック
スおよび辞書本体は、基本単語と派生単語と意味用例と
に分けて管理されている。基本単語とは、一つの単語が
派生表記を有する場合、例えば「取り扱い」に対して
「取扱」や「取扱い」などが表記として許されている場
合、これらの表記を代表する単語として予め定められた
単語である。即ち、基本単語とは、文節分かち書きや係
り受けの処理において代表的に用いられる単語を意味し
ているに過ぎない。単語辞書に記録されている語である
ため、代表単語と呼ぶが、言語における基本的に単語と
いう意味ではない。以下、基本単語のことを、その表示
については、「代表表記」と呼び、派生単語については
「派生表記」と呼ぶ。FIG. 15 shows the word search routine in detail. When the routine is started, the word search start position M is first set to 1, that is, to the head of the input kana string (step S402). Next, the variable L, which holds the length of the reading used in the search, is initialized to 1 (step S404), and words whose reading has length L are looked up in the independent word dictionary 58 and the attached word dictionary 68 (step S405).
As shown in FIG. 16, the independent word dictionary 58 consists of a header, an index, and a dictionary body. The header holds information for managing the dictionary itself. The index and the dictionary body are each managed in three parts: basic words, derived words, and semantic usage examples. When a word has derived notations, for example when "取扱" and "取扱い" are also accepted as ways of writing "取り扱い" (handling), the basic word is the word designated in advance to represent those notations. In other words, a basic word is simply the word used as the representative in phrase segmentation and dependency processing. Because it is the word recorded in the word dictionary it is called the representative word, but this does not mean that it is a basic word of the language. Hereinafter, the written form of a basic word is called the "representative notation" and that of a derived word the "derived notation".

【００６９】意味用例についての領域は、第１実施例で
説明した係り受けに関する情報と同一の情報が管理され
ている領域である。したがって、第２実施例では、自立
語辞書５８と係り受け辞書９８とが、一体化されてい
る。意味用例の領域に記憶された情報は、基本単語を中
心とする係り受けの情報である。係り受けの情報と基本
単語および派生単語との関係については、後述する。The semantic usage example area manages the same information as the dependency information described in the first embodiment. In the second embodiment, therefore, the independent word dictionary 58 and the dependency dictionary 98 are integrated. The information stored in this area is dependency information centered on basic words. The relationship between the dependency information and the basic and derived words is described later.

【００７０】これらの基本単語，派生単語，意味用例
は、辞書本体においては、Ｂ−Ｔｒｅｅ構造により管理
されている。Ｂ−Ｔｒｅｅ構造は、多数のデータを検索
する場合に採用される周知の管理構造であり、多数のデ
ータが存在する場合、データが適正に編成されていれ
ば、目的とするデータにたどり着くまでの時間が平均的
な時間になる構造として知られている。辞書本体におけ
るＢ-Ｔｒｅｅ構造の一例を図１７に示した。読み（仮
名文字列）に基づいてＢ−Ｔｒｅｅ制御ブロックを辿っ
て単語ブロックに至ると、ここに実際の単語データがお
かれている。In the dictionary body, these basic words, derived words, and semantic usage examples are managed in a B-Tree structure. The B-Tree is a well-known management structure for searching large amounts of data; when a large amount of data is present and properly organized, the time needed to reach any target item is close to uniform. FIG. 17 shows an example of the B-Tree structure in the dictionary body. Following the B-Tree control blocks according to the reading (kana string) leads to a word block, which holds the actual word data.

【００７１】基本単語領域などの単語データは、大まか
には、図１８に示すデータ構造を有している。即ち、先
頭に単語データのデータ長Ｘを示すデータが存在し、そ
の後、Ｘバイトの実データが続いている。実データの先
頭には、見出し語の長さＹが記録されており、続いてＹ
バイトの見出し語が記録されている。実際の単語データ
は、その後に続いている。単語データは、その先頭に単
語長Ｗが記録されており、その直後に漢字データの有無
などを示す１バイトのフラグが記録されている。フラグ
の後には、漢字データが記録されているが、この漢字デ
ータは、漢字データ長と実際の漢字文字列を示す漢字コ
ードから構成されている。その後、単語情報および品詞
情報（場合によっては複数の品詞情報）が記録されてい
る。単語情報は、単語情報の長さを示すデータと、実際
の単語情報とからなる。Word data, such as that in the basic word area, broadly has the data structure shown in FIG. 18. The record begins with a value giving the data length X of the word data, followed by X bytes of actual data. The actual data begins with the headword length Y, followed by
the Y-byte headword. The word data proper comes next. It begins with the word length W, immediately followed by a 1-byte flag indicating, among other things, whether kanji data is present. The flag is followed by the kanji data, which consists of the kanji data length and the kanji codes representing the actual kanji string. After that come the word information and the part-of-speech information (in some cases several items of part-of-speech information). The word information consists of a value giving its length and the word information itself.
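
The nested length-prefixed layout of FIG. 18 can be sketched as a pack and parse pair. The source states only that the flag is one byte; the one-byte width of every length field, the UTF-8 encoding, the reading of W as the byte length of the remaining word data, and the function names are all assumptions made for this example.

```python
def pack_word(headword, kanji, word_info, pos_info):
    """Build one record: [X][Y][headword][W][flag][kanji len][kanji][info len][info][pos]."""
    hw = headword.encode("utf-8")
    kj = kanji.encode("utf-8")
    flag = 1 if kj else 0                      # assumed: bit 0 means kanji present
    body = (bytes([flag]) + bytes([len(kj)]) + kj
            + bytes([len(word_info)]) + word_info + pos_info)
    word = bytes([len(body)]) + body           # W, then the word data
    real = bytes([len(hw)]) + hw + word        # Y, headword, word data
    return bytes([len(real)]) + real           # X, then X bytes of actual data


def parse_word(record):
    """Inverse of pack_word under the same assumed field widths."""
    x = record[0]
    real = record[1:1 + x]
    y = real[0]
    headword = real[1:1 + y].decode("utf-8")
    word = real[1 + y:]
    w = word[0]
    body = word[1:1 + w]
    flag = body[0]
    kanji_len = body[1]
    kanji = body[2:2 + kanji_len].decode("utf-8")
    p = 2 + kanji_len
    info_len = body[p]
    word_info = body[p + 1:p + 1 + info_len]
    pos_info = body[p + 1 + info_len:]
    return {"headword": headword, "has_kanji": bool(flag), "kanji": kanji,
            "word_info": word_info, "pos_info": pos_info}
```

Length prefixes at each level let a reader skip an entire record, or just its headword, without decoding the rest, which suits scanning a word block.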

【００７２】このように、基本単語でも派生単語でも、
Ｂ−Ｔｒｅｅ構造を用いて、単語の見出し文字列に基づ
いて、所望の単語に関する情報を取り出すことができ
る。これらの単語情報は、更に図１９に示すように、セ
パレータとこれに続くデータとから構成されている。セ
パレータとしては、それ以後に続くデータが表示される
漢字のデータであることを示す表示漢字セパレータや、
データが読み情報であることを示す読み情報セパレー
タ、派生表記であることを示す派生表記セパレータなど
がある。表示漢字とは、一つの単語に代表表記と派生表
記とがある場合に、デフォルトで漢字を表示するため
に、代表表記に対応する漢字での表記を記録しているも
のである。派生表記セパレータは、図１９に示すよう
に、セパレータの下位３ビットが派生表記の数に対応し
ており、その後に続く派生表記１，派生表記２は、図２
０に例示したように、代表表記に対する変容の形態を番
号で示したものとなっている。即ち、派生表記１が、例
えば番号５であれば、代表表記が「メモリ」であれば、
「長音あり」が派生表記として存在することを示し、
「メモリー」を意味する。即ち、派生表記の情報として
は、実際の派生表記そのものが記憶されている訳ではな
く、派生表記の形態が番号で記憶されているのである。
また、読み情報は、単語の読みを与えるものであり、見
出し語が漢字である場合などにその読みを与えるもので
ある。この情報は、漢字から意味を同じくする他の漢字
を検索する連想変換などの際に用いられる。なお、一つ
の基本単語とこの基本単語（代表表記）に対応する派生
単語（派生表記）とは、別々の領域で管理されている
が、単語の読み（見出し）と単語の品詞情報とが一致す
るものについて、対応関係があるとみなしている。In this way, for basic words and derived words alike,
information about a desired word can be retrieved through the B-Tree structure from the word's heading string. As shown in FIG. 19, each item of word information consists of a separator and the data that follows it. Separators include the display-kanji separator, which indicates that the following data is the kanji to be displayed;
the reading-information separator, which indicates that the data is reading information; and the derived-notation separator, which indicates derived notations. The display kanji records the kanji form corresponding to the representative notation, so that a kanji form can be displayed by default when a word has both a representative notation and derived notations. As shown in FIG. 19, the lower 3 bits of the derived-notation separator give the number of derived notations, and the entries that follow (derived notation 1, derived notation 2, and so on)
are numbers identifying how the representative notation is transformed, as illustrated in FIG. 20. For example, if derived notation 1 is number 5 and the representative notation is "メモリ" (memory),
the entry indicates that a form "with a long-vowel mark" exists as a derived notation,
namely "メモリー". That is, the derived-notation information does not store the derived notation itself but only a number identifying its form.
The reading information gives the reading of a word, for example when the headword is written in kanji. It is used in operations such as associative conversion, which searches from one kanji for other kanji with the same meaning. A basic word (representative notation) and its derived words (derived notations) are managed in separate areas; entries whose reading (heading) and part-of-speech information match are regarded as corresponding.
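
Decoding the derived-notation entry can be sketched as below. Only two facts come from the description: the lower 3 bits of the separator give the count, and form number 5 means "with a long-vowel mark". The separator's upper bits, the rule table, and the function names are hypothetical.

```python
# Form number -> transformation of the representative notation.
# Only number 5 is taken from the description; FIG. 20 defines the others.
DERIVATION_RULES = {
    5: lambda word: word + "ー",   # "with a long-vowel mark"
}


def derivative_count(separator):
    """The lower 3 bits of the separator give the number of derived notations."""
    return separator & 0b111


def decode_derivatives(separator, form_numbers, representative):
    """Rebuild the derived notations from their form numbers."""
    count = derivative_count(separator)
    return [DERIVATION_RULES[n](representative) for n in form_numbers[:count]]
```

Storing a small form number instead of the full string keeps each entry to a single value, at the cost of needing the rule table whenever a derived notation is displayed.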

【００７３】図１５に戻って、辞書検索を行なった後
（ステップＳ４０５）、開始位置Ｍから長さＬの読みの
仮名文字列に合致する単語が見つかったか否かを判定す
る（ステップＳ４０６）。該当語が見つかった場合に
は、次に、その代表単語に付属するデータのうち派生表
記に関するものを読み込む処理を行なう（ステップＳ４
０８）。代表単語に対して種々の派生表記が存在し、派
生単語管理領域に派生単語が記憶されている場合には、
代表単語の付属データに派生表記への差し替えの必要・
不要を示すフラグが保存されている。そこで、かかるフ
ラグが参照し、代表単語を派生単語に差し替えるよう指
示がなされている単語であるか否かを判断する（ステッ
プＳ４１０）。このフラグに、派生単語への差し替えの
必要を示す値が設定されている場合には、先に検索した
代表単語を展開バッファに展開する共に、その単語にマ
ークを付与する（ステップＳ４１１）。展開バッファと
は、入力した仮名文字列に対して、この仮名文字列を構
成し得る総ての代表単語および付属語を展開するための
記憶領域であり、ＲＡＭ２３上に確保されたメモリ領域
である。Returning to FIG. 15, after the dictionary search (step S405), it is judged whether a word was found that matches the kana string of length L starting at position M (step S406). If a word was found, the data attached to that representative word that concerns derived notations is read (step S408).
When a representative word has derived notations and the derived words are stored in the derived word management area,
the data attached to the representative word includes a flag indicating whether replacement with a derived notation is required.
This flag is consulted to judge whether the word is one for which replacement of the representative word with a derived word has been specified (step S410). If the flag is set to the value indicating that replacement is required, the representative word just retrieved is expanded into the expansion buffer and a mark is attached to it (step S411). The expansion buffer is a memory area reserved in the RAM 23 into which all the representative words and attached words that could make up the input kana string are expanded.

【００７４】派生単語への差し替えの指示がなされてい
ない場合、もしくは派生単語への差し替えが指示されて
いて代表単語にマークを付与した後、処理は、ステップ
Ｓ４０５に戻って、読みの長さＬの語を更に検索する処
理から繰り返す。読みの長さＬの語がもはや自立語辞書
５８に存在しないと判断された場合には（ステップＳ４
０６）、検索単語の長さをのばすことができるか否かを
判断する（ステップＳ４１２）。入力された仮名文字列
の全長さＡに対してＭ＋Ｌ＜Ａならば、読みの長さＬを
大きくすることができると判断し、読みを一文字分長く
する処理（即ち、Ｌを値１だけインクリメントする処
理）を行なう（ステップＳ４１４）。読みの長さＬを値
１だけ増加した後、ステップＳ４０５から上述した処理
を繰り返す。If replacement with a derived word has not been specified, or after the representative word has been marked because replacement was specified, the process returns to step S405 and repeats from the search for further words whose reading has length L. When it is judged that no more words of reading length L exist in the independent word dictionary 58 (step S406),
it is judged whether the search length can be extended (step S412). If M + L < A, where A is the total length of the input kana string, the reading length L can be increased, and the reading is lengthened by one character, that is, L is incremented by 1 (step S414). After L has been incremented, the processing described above is repeated from step S405.

【００７５】この結果、展開バッファには、開始位置Ｍ
における長さ１から最大長さまでの読みの全単語が展開
される。単語の展開およびコスト付与などについては、
第１実施例と同様に行なわれる（図４参照）。読みの長
さを順次長くしていって、単語長が伸ばせなくなると
（ステップＳ４１２）、次に単語検索の開始位置Ｍを、
入力した仮名文字列の末尾に向かって移動可能か否かを
判断する（ステップＳ４１６）。移動可能であれば、そ
の開始位置Ｍを先頭とする単語の検索はすべて終わった
と判断し、開始位置Ｍを値１だけインクリメントする処
理を行なった後（ステップＳ４１８）、読みの長さＬを
値１に戻して、上述した処理を繰り返すか越す。従っ
て、これらの処理が行なわれると、展開バッファには、
入力した仮名文字列を構成し得る可能性のある総ての代
表単語および付属語が展開され、かつ派生表記のある代
表単語については、これにマークを付与した状態とされ
る。As a result, every word at start position M with a reading of length 1 up to the maximum length
is expanded into the expansion buffer. Word expansion and cost assignment
are performed as in the first embodiment (see FIG. 4). When the reading has been lengthened step by step until the word length can no longer be extended (step S412),
it is judged whether the word search start position M can be moved toward the end of the input kana string (step S416). If it can, all searches for words beginning at start position M are judged to be finished, M is incremented by 1 (step S418), the reading length L is reset to 1, and the processing described above is repeated. When this processing is complete,
the expansion buffer holds every representative word and attached word that could make up the input kana string, and each representative word that has derived notations carries a mark.
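
The double loop over start position M and reading length L can be sketched as follows. The sketch uses 0-based positions instead of the 1-based M of the flowchart, and the dictionary layout (reading mapped to a list of word and replacement-flag pairs) and sample entries are assumptions for illustration.

```python
def search_words(kana, dictionary):
    """Expand every dictionary word found at every position of the kana string.

    Returns a list of (start, length, word, marked) tuples, where marked is
    True for representative words flagged for replacement by a derived notation.
    """
    buffer = []
    for start in range(len(kana)):                       # start position M
        for length in range(1, len(kana) - start + 1):   # reading length L
            reading = kana[start:start + length]
            for word, replace_flag in dictionary.get(reading, []):
                buffer.append((start, length, word, replace_flag))
    return buffer


# Hypothetical dictionary: reading -> [(representative word, replacement flag)].
SAMPLE_DICTIONARY = {
    "きそく": [("規則", False)],
    "が": [("が", False)],
    "かわる": [("変わる", True), ("替わる", False)],
}
expanded = search_words("きそくがかわる", SAMPLE_DICTIONARY)
```

Only representative words enter the buffer; the mark records that a derived notation may need to be substituted when the result is displayed.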

【００７６】以上の処理により単語検索処理（図１４ス
テップＳ４００）が完了する。そこで、次に文節分かち
書き処理が行なわれる（ステップＳ４２０）。文節分か
ち書きは、周知のものであり、展開バッファに展開した
上記単語を用い、各単語に付与した値の総和がもっとも
小さな値となるように、文節の組み合わせを決定する。
文節分かち書きの処理については第１実施例と変わると
ころは特にない。The above completes the word search processing (step S400 in FIG. 14). Phrase segmentation is performed next (step S420). Phrase segmentation is a well-known technique: using the words expanded into the expansion buffer, the combination of phrases is chosen so that the sum of the values assigned to the words is smallest.
The segmentation processing does not differ in any particular way from that of the first embodiment.
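
Choosing the combination with the smallest sum of assigned values can be sketched as a dynamic program over positions in the kana string. This is a simplified word-level version with hypothetical costs; it ignores the grouping of words into phrases and any connection costs the actual method may apply.

```python
def min_cost_path(kana, entries):
    """Return (total cost, words) of the cheapest cover of kana, or None.

    entries is a list of (start, length, word, cost) tuples, with 0-based
    start positions, such as an expansion buffer with costs attached.
    """
    n = len(kana)
    best = [None] * (n + 1)      # best[i]: cheapest way to cover kana[:i]
    best[0] = (0, [])
    for pos in range(n):
        if best[pos] is None:
            continue
        for start, length, word, cost in entries:
            if start != pos:
                continue
            end = start + length
            candidate = (best[pos][0] + cost, best[pos][1] + [word])
            if best[end] is None or candidate[0] < best[end][0]:
                best[end] = candidate
    return best[n]


# Hypothetical costs for the kana string じょうききしゃをきしゃさせ.
ENTRIES = [
    (0, 7, "蒸気汽車", 2), (0, 4, "上記", 2), (4, 3, "記者", 2),
    (7, 1, "を", 1), (8, 3, "帰社", 2), (11, 2, "させ", 1),
]
result = min_cost_path("じょうききしゃをきしゃさせ", ENTRIES)
```

With these costs the path through "蒸気汽車" wins on cost alone, which is exactly the situation in which dependency verification is needed to promote a different first candidate.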

【００７７】次に係り受けの検定処理を行なう（ステッ
プＳ４３０）。係り受けの検定についてもその内容は第
１実施例と同様であるが、本実施例では、上述したよう
に、展開バッファには代表単語のみが展開されており、
派生単語は展開されていない。第１実施例では、例えば
「規則」＋「が」＋「変わる」という係り受けが存在す
る場合、「変わる」について派生表記「変る」が存在す
れば、係り受け辞書には「規則が変わる」という係り受
けと「規則が変る」という係り受けとが記憶されてい
た。本実施例では、係り受けの検定自体は、代表表記の
みで行なうので、「きそく」＋「が」＋「かわる」とい
う文字列に対する係り受けの検定は、「規則」＋「が」
＋「変わる」のみについて行なわれる。したがって、係
り受けの検定に要する時間は短縮されている。なお、本
実施例では、係り受け辞書は、自立語辞書５８の内部に
含まれており、意味用例の管理領域に記憶されている。
係り受け辞書の一例を図２１に示す。係り受け辞書の内
容は、第１実施例（図７参照）と同様、読み、受け語、
係り語、許容する付属語からなる。また、使役や受動に
ついても係り受けの判断や、係り受けの規則についても
第１実施例と同様の規則を適用している（図１３参
照）。Dependency verification is performed next (step S430). Its content is the same as in the first embodiment, but in this embodiment, as described above, only representative words are expanded into the expansion buffer;
derived words are not. In the first embodiment, if the dependency "規則" (rule) + "が" + "変わる" (change) existed and "変わる" had the derived notation "変る", the dependency dictionary stored both "規則が変わる" and "規則が変る". In this embodiment the dependency verification itself uses only representative notations, so for the string "きそく" + "が" + "かわる" it is performed only for "規則" + "が"
+ "変わる". The time required for dependency verification is therefore reduced. In this embodiment the dependency dictionary is contained within the independent word dictionary 58 and stored in the management area for semantic usage examples.
FIG. 21 shows an example of the dependency dictionary. As in the first embodiment (see FIG. 7), its entries consist of a reading,
a receiving word, a modifying word, and the permitted attached words. The judgment of dependencies for the causative and passive, and the dependency rules, are also the same as in the first embodiment (see FIG. 13).

【００７８】こうして係り受けの検定を行ない、文節分
かち書きによっては決定できない単語候補（例えば「き
そくがかわる」における「変わる」と「替わる」）が見
いだされたものについて、係り受けが成立する単語が見
いだされれば、この単語を第１候補とする処理が行なわ
れる。その後、第１候補とされた単語について、単語の
表記を差し替える処理を行なう（ステップＳ４４０）。
単語の差し替え処理は、第１候補とされた語（例えば
「変わる」）について派生表記があるか否かを判断し、
派生表記が存在する場合には、代表表記，派生表記の中
で、最前に使用された表記を調べ、その表記に差し替え
るものである。直前に使用された語が「変る」であれ
ば、単語検索，分かち書き処理，係り受けの検定で一貫
して用いてきた代表単語に代えて、派生表記である「変
る」を用いるのである。Dependency verification is performed in this way. For word candidates that phrase segmentation alone cannot decide between (for example, "変わる" and "替わる" in "きそくがかわる"), if a word for which a dependency holds is found, that word is made the first candidate. The notation of the word made the first candidate is then replaced as needed (step S440).
The word replacement processing judges whether the first-candidate word (for example, "変わる") has derived notations and,
if it does, finds which of the representative and derived notations was used most recently and substitutes that notation. If the notation used most recently was "変る", the derived notation "変る" is used in place of the representative word that was used consistently through the word search, segmentation, and dependency verification.
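
Promoting the homophone for which a dependency holds can be sketched as a stable sort over the candidate list. The dependency set and the function name are assumptions for this example; only representative notations appear in it, in keeping with the second embodiment.

```python
# Hypothetical dependency entries, stored for representative notations only:
# (modifying word, permitted particle, receiving word).
DEPENDENCIES = {("規則", "が", "変わる")}


def order_candidates(modifier, particle, candidates):
    """Move candidates with an established dependency to the front.

    sorted() is stable, so candidates that tie keep their original order.
    """
    return sorted(candidates,
                  key=lambda word: (modifier, particle, word) not in DEPENDENCIES)
```

For "きそくがかわる", `order_candidates("規則", "が", ["替わる", "変わる"])` places "変わる" first; the derived notation "変る" is never looked up at this stage.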

[0079] FIG. 22 shows the processing routine that performs the word replacement. The routine first sets the target word to the word of the first phrase produced by the phrase segmentation process (step S500) and determines whether the target word is marked (step S510). A mark indicates that a derivative notation exists and that replacement with a derivative notation has been instructed, so the first candidate is then replaced with the display word (step S520). The display word is the representative word itself if the representative notation was used previously, and the derivative word if a derivative notation was previously used and learned. If the target word is not marked, the display word is not replaced.

[0080] The routine then determines whether any unprocessed words remain (step S530). If so, it moves the target word back by one (step S540) and repeats the processing from step S510. If no target word remains, the routine ends. In this processing, learning which notation to use as the display word when derivative notations exist is simple if the word at the head of the derivative-word management area is taken as the display word. In that case, when the representative word is to be used, the representative notation itself (or derivative notation information equivalent to the representative notation) may be stored at the head of the derivative words, or a flag indicating whether the representative notation or a derivative notation is to be used may be stored there. When a derivative notation is used, that notation may also be stored in the management area of the representative word.
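The routine of FIG. 22 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Word` class, its field names, and the `use_derivative` flag are hypothetical stand-ins for the mark, the derivative-word management area, and the flag variant described above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Word:
    representative: str                                   # notation used in search, segmentation, dependency check
    derivatives: List[str] = field(default_factory=list)  # derivative notations; the head entry is the learned one
    marked: bool = False                                  # set when a derivative exists and replacement is requested
    use_derivative: bool = False                          # hypothetical flag: show derivatives[0], not the representative


def display_word(word: Word) -> str:
    """Pick the notation to show for a marked word (step S520)."""
    if word.use_derivative and word.derivatives:
        return word.derivatives[0]
    return word.representative


def replace_words(first_candidates: List[Word]) -> List[str]:
    """Visit each first-candidate word in phrase order (steps S500 to S540)."""
    shown = []
    for word in first_candidates:              # S500 starts at the first phrase; S530/S540 advance
        if word.marked:                        # S510: is the target word marked?
            shown.append(display_word(word))   # S520: substitute the display word
        else:
            shown.append(word.representative)  # unmarked words keep the representative notation
    return shown


sentence = [
    Word("規則"),
    Word("が"),
    Word("変わる", derivatives=["変る"], marked=True, use_derivative=True),
]
print("".join(replace_words(sentence)))
```

Keeping the choice of notation on the word itself means the earlier stages never need to see the derivative forms.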

[0081] In the display process that follows the word replacement process, the word candidates after kana-kanji conversion are displayed on the CRT 26 according to the replaced notation. As shown in FIGS. 19 and 20, a derivative notation is not stored as the kanji string itself; only a number indicating the type of derivative notation is recorded after the derivative notation separator. The display process therefore acts on this number. For example, if the derivative notation information is "4", it judges that the "permissible" (許容) okurigana form applies and displays the representative word 「変わる」 as 「変る」.

[0082] According to the embodiment described above, only representative words are used as word candidates from the word search on the input kana character string through the segmentation process and the dependency processing. Therefore, even when a word with recognized derivative notations is retrieved, the derivative notations need not be considered, and each process can be performed at high speed. The expansion buffer in which word candidates are expanded can also be small. A further advantage is that dependencies are easy to judge even when several of them exist. This is explained using the example of FIG. 21. As illustrated there, suppose there are three dependencies, 「規則」+「が」+「変わる」, 「取り扱い」+「が」+「変わる」, and 「荷物」+「の」+「取り扱い」, and that the derivative notations 「変る」 and 「取扱い」 exist for the respective words. Suppose also that, for 「規則が変わる」, the notation 「規則」+「が」+「変る」 has been learned. If 「変わる」 and 「変る」 were managed separately as dependency information, "とりあつかいがかわる" would be converted to 「取り扱い」+「が」+「変わる」 even though 「変る」 had been learned when converting "きそくがかわる". In this embodiment, by contrast, what is learned is that the derivative "permissible" (許容) notation, not the "standard" (本則) notation, is used for 「変わる」, so 「変る」 is used consistently for 「変わる」 even when a different dependency holds.

[0083] Likewise, consider converting "にもつのとりあつかいがかわる". Under a conventional dependency judgment, if dependency information exists for 「荷物の／取り扱い」 but not for 「荷物の／取扱い」, the dependency chain 「荷物の／取り扱いが／変わる」 is compared with the chain 「取扱いが／変わる」, and the former is given priority. As a result, even if 「取扱いが／変わる」 was learned immediately before, the string is converted to 「荷物の／取り扱いが／変わる」. In this embodiment, by contrast, all that is learned is that the "permissible" derivative notation is used for the word 「取り扱い」, so once 「取扱い」 has been learned the string is correctly converted to 「荷物の／取扱いが／変わる」.
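The behaviour described in the two paragraphs above can be illustrated with a small sketch. It assumes a dependency dictionary keyed only by representative notations and a per-word record of the learned display notation. The names `DEPENDENCIES`, `DERIVATIVES`, `learned`, `learn`, and `render` are hypothetical and chosen only for this illustration.

```python
# Dependency entries use representative notations only:
# (modifying word, attached word, modified word), as in the FIG. 21 example.
DEPENDENCIES = {
    ("規則", "が", "変わる"),
    ("取り扱い", "が", "変わる"),
    ("荷物", "の", "取り扱い"),
}

# Derivative notations registered for each representative word.
DERIVATIVES = {"変わる": ["変る"], "取り扱い": ["取扱い"]}

# Learned display notation per representative word (absent: show the representative).
learned = {}


def learn(representative, notation):
    """Record which notation to display for a word, independent of any dependency."""
    if notation == representative or notation in DERIVATIVES.get(representative, []):
        learned[representative] = notation


def render(modifier, attached, modified):
    """Check the dependency on representative notations, then apply learned notations."""
    if (modifier, attached, modified) not in DEPENDENCIES:
        return None
    return "".join(learned.get(w, w) for w in (modifier, attached, modified))


learn("変わる", "変る")                   # learned while converting "きそくがかわる"
print(render("規則", "が", "変わる"))
print(render("取り扱い", "が", "変わる"))  # the learned notation carries over to another dependency
```

Because the learned notation is attached to the word rather than to a dependency entry, the dictionary needs one entry per dependency regardless of how many derivative notations exist.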

[0084] In this embodiment, the word search, the phrase segmentation process, and the dependency check are all performed with representative words, but some of these processes may instead be performed using both the representative notation and the derivative notations.

[0085] Several embodiments of the present invention have been described above, but the invention is in no way limited to them. It can of course be implemented in various forms without departing from its gist, for example in a configuration that uses another phrase segmentation technique, such as the two-phrase longest-match method, in place of the minimum cost method.

[0086]

[Effects of the Invention] As described above, in the first kana-kanji conversion device and kana-kanji conversion method of the present invention, when phrases matching the information on a modifying word and a modified word are found, two determinations are made: whether the word attached to the modifying word is a permitted attached word, and, when the attached word following the modified word expresses the causative or passive, whether the word attached to the modifying word is one that corresponds to the causative or passive. The phrase segmentation candidates are restricted on the basis of these results. Thus, when dependency information exists, a dependency is not deemed to hold merely because both words are present. It is deemed to hold when the word attached to the modifying word permits the relationship between the two words and, where the modified word is in the causative or passive, when the word attached to the modifying word corresponds to the causative or passive. Because phrase candidates are restricted on that basis, undesired segmentation candidates become less likely to be selected and the desired segmentation becomes more likely to be produced. If, instead of restricting phrase candidates, the priority of the kanji candidates for each phrase is changed on the premise of a phrase segmentation already estimated by another technique, the likelihood of obtaining the desired kanji candidate as the first candidate can be increased. Dependency information is used not merely to change the priority of word candidates but at the phrase segmentation stage itself, and whether a dependency holds can be judged in causative and passive cases as well, so segmentation candidates that make use of dependency, a higher-order linguistic phenomenon, can be obtained.
In addition, no special dictionary needs to be prepared for analyzing causative and passive dependencies, and the processing can be performed at high speed.
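The two determinations summarized above can be sketched as follows. This is an illustration under stated assumptions, not the method of the embodiment: the example entries in `PERMITTED` and the particle correspondences in `VOICE_SHIFT` are hypothetical, since this section says only that the permitted attached words change in causative and passive cases. Deriving the causative and passive particles from the ordinary entry reflects the statement that no special dictionary is needed.

```python
# Hypothetical dependency entries: (modifying word, modified word) -> permitted attached words.
PERMITTED = {
    ("本", "読む"): {"を"},
    ("子供", "読む"): {"が"},
}

# Assumed correspondences between ordinary particles and those used when the
# modified word is passive or causative (illustrative values only).
VOICE_SHIFT = {
    "passive": {"を": {"が"}},
    "causative": {"が": {"に", "を"}},
}


def check_dependency(modifier, particle, modified, voice=None):
    """Return True when the dependency is judged to hold.

    voice is None, "passive", or "causative", describing the attached word
    that follows the modified word.
    """
    base = PERMITTED.get((modifier, modified))
    if base is None:
        return False              # no dependency information for this word pair
    if particle in base:
        return True               # first determination: a permitted attached word
    if voice is None:
        return False
    allowed = set()
    for p in base:                # second determination: particles shifted by voice
        allowed |= VOICE_SHIFT[voice].get(p, set())
    return particle in allowed


print(check_dependency("本", "を", "読む"))
print(check_dependency("本", "が", "読む"))
print(check_dependency("本", "が", "読む", voice="passive"))
```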

[0087] In addition, the kana-kanji conversion device of claim 2 uses only the representative notation in the dependency check, which reduces both the time and the memory capacity that the check requires. The kana-kanji conversion device of claim 3 uses the dependency information, including the causative and passive handling described above, to change the priority of the kanji candidates for each phrase instead of restricting the phrase segmentation, so the desired kanji candidate can be obtained with high priority. The kana-kanji conversion device of claim 4 shortens the search time for segmentation and avoids erroneously selecting interlinked (crossing) dependencies.

[0088] According to the kana-kanji conversion device of claim 5, words for which a dependency holds can be preferentially made phrase candidates. According to the kana-kanji conversion device of claim 6, the phrase segmentation is preferentially selected on the basis of the dependency, and the word for which the dependency holds is made the first candidate for kana-kanji conversion.

[0089] According to the kana-kanji conversion device of claim 7, the amount of information held about attached words can be reduced. According to the kana-kanji conversion device of claim 8, a dependency can be found even when modifying words lie between the modifying word and the modified word that form the dependency. In addition, if the range from the starting phrase to the found phrase is excluded from the search range for the next dependency once a dependency relationship is found, crossing dependencies are eliminated, misjudgment of the correct dependency is avoided, and the dependency search is sped up.

[0090] According to the kana-kanji conversion device of claim 9, an appropriate word candidate can be selected even when no phrase containing a word in a dependency relationship is found.

[0091] According to the second kana-kanji conversion device and kana-kanji conversion method of the present invention, the dependency check that refers to the dependency information dictionary uses only the representative notations registered in the word dictionary to identify the words that make up the input character string, whereas the candidate character strings for the identified words are displayed using the derivative notations as well as the representative notations stored in the word dictionary. Fast grammatical processing, including dependency analysis, is thereby combined with variety of notation.

[Brief description of drawings]

FIG. 1 is a functional block diagram showing how the kana-kanji conversion function is implemented in a kana-kanji conversion device according to one embodiment of the present invention.

FIG. 2 is a block diagram showing the hardware on which the kana-kanji conversion device of the embodiment is implemented.

FIG. 3 is a flowchart showing the phrase segmentation process executed by the phrase segmentation unit 102.

FIG. 4 is an explanatory diagram showing phrase segmentation by the minimum cost method.

FIG. 5 is an explanatory diagram showing another example of phrase segmentation by the minimum cost method.

FIG. 6 is an explanatory diagram showing the dependency check process in the embodiment.

FIG. 7 is an explanatory diagram showing an example of the dependency dictionary in the first embodiment.

FIG. 8 is an explanatory diagram showing an example of phrase segmentation performed using dependency information.

FIG. 9 is an explanatory diagram showing the processing of another example sentence.

FIG. 10 is an explanatory diagram showing the priority of the kana-kanji conversion candidates in that case.

FIG. 11 is an explanatory diagram showing the processing of a different type of dependency.

FIG. 12 is an explanatory diagram showing the priority of the kana-kanji conversion candidates in that case.

FIG. 13 is an explanatory diagram showing the types of cases in which a plurality of dependencies exist within one input character string.

FIG. 14 is a flowchart showing the kana-kanji conversion processing routine in the second embodiment.

FIG. 15 is a flowchart showing details of the word search process in that routine.

FIG. 16 is an explanatory diagram showing the internal configuration of the independent word dictionary 58.

FIG. 17 is an explanatory diagram showing how the basic word area is managed.

FIG. 18 is an explanatory diagram showing the structure of word data.

FIG. 19 is an explanatory diagram showing details of the word information together with each separator.

FIG. 20 is an explanatory diagram showing an example of derivative notations.

FIG. 21 is an explanatory diagram showing the schematic configuration of the dependency dictionary and an example of representative and derivative notations.

FIG. 22 is a flowchart showing the word replacement processing routine.

[Explanation of symbols]

21 ... CPU
22 ... ROM
23 ... RAM
24 ... Keyboard
25 ... Keyboard interface
26 ... CRT
27 ... CRTC
28 ... Printer
29 ... Printer interface
30 ... Hard disk controller (HDC)
31 ... Bus
32 ... Hard disk
40 ... Character input unit
42 ... Conversion control unit
44 ... Converted character string output unit
50 ... Character string input unit
52 ... Character storage unit
54 ... Independent word candidate creation unit
56 ... Independent word analysis position management unit
58 ... Independent word dictionary
64 ... Attached word candidate creation unit
66 ... Attached word analysis position management unit
68 ... Attached word dictionary
70 ... Dependency learning unit
70 ... Learning unit
72 ... Independent word learning unit
74 ... Auxiliary word learning unit
76 ... Affix learning unit
78 ... Character conversion learning unit
80 ... Word data creation unit
82 ... Connection check unit
84 ... Connection check table
86 ... Cost calculation unit
90 ... Dependency candidate adjustment unit
92 ... Passive analysis unit
94 ... Particle permission analysis unit
96 ... Dependency range management unit
98 ... Dependency dictionary
100 ... Word data storage unit
102 ... Phrase segmentation unit
104 ... Dependency transposition information adjustment unit
106 ... Phrase data storage unit
108 ... Converted character string output unit

Claims

[Claims]

1. A kana-kanji conversion device that receives a kana character string, segments the input kana character string into phrases with reference to a word dictionary to create phrase segmentation candidates, and generates candidate character strings of mixed kana and kanji, the device comprising: a dependency information dictionary that stores information on the modifying word and the modified word that form a dependency between predetermined phrases, together with information on the permitted attached words allowed between the modifying word and the modified word; phrase search means for searching, when the input character string is segmented into phrases, for phrases containing words that correspond to the information on the modifying word and the modified word; first determination means for determining, when phrases containing words that correspond to the information on the modifying word and the modified word are found, whether the word attached to the modifying word is one of the permitted attached words; second determination means for determining, when such phrases are found and the attached word following the modified word expresses the causative or passive, whether the word attached to the modifying word is one that corresponds to the causative or passive; and phrase candidate limiting means for restricting the phrase segmentation candidates on the basis of the determination results of the first and second determination means.

2. The kana-kanji conversion device according to claim 1, wherein the dictionary is a word dictionary that stores a reading of a word and a notation corresponding to the reading and that has a notation information section which, when a plurality of notations correspond to the reading of the word, stores the notation designated as the representative notation and the notations designated as derivative notations; the phrase search means is means for searching for the dependency using only the representative notation stored in the word dictionary; and the device further comprises means for displaying, for the phrase segmentation candidates, the converted candidate character strings using the representative notation and the derivative notations stored in the notation information section of the word dictionary.

3. The kana-kanji conversion device according to claim 1, comprising, in place of the phrase candidate limiting means, kanji candidate priority means for changing the priority of the kanji candidates for each phrase on the basis of the determination results of the first and second determination means.

4. The kana-kanji conversion device according to claim 1, wherein the phrase search means comprises: retrograde search means for searching, with a rearward phrase as the starting point and moving successively toward the front while excluding any range already searched, for a phrase containing a word that corresponds to the dependency information; and searched-range registration means for registering, when the search finds a phrase containing a word that corresponds to the dependency information, the range from the starting phrase to the found phrase as an already-searched range for dependency information.

5. The kana-kanji conversion device according to claim 1, wherein the phrase candidate limiting means comprises means for preferentially selecting a phrase segmentation that contains the words concerned when the first or second determination means has determined that the corresponding modifying word, modified word, and attached word are present.

6. The kana-kanji conversion device according to claim 3, wherein the kanji candidate priority means comprises means for, when the first or second determination means has determined that the corresponding modifying word, modified word, and attached word are present, preferentially selecting a phrase segmentation that contains the words concerned and making the word the first candidate for kana-kanji conversion.

7. The kana-kanji conversion device according to claim 1, wherein the phrase search means comprises means for determining that the dependency relationship holds when the attached word lying between the words in the dependency relationship has a predetermined specific grammatical structure.

8. The kana-kanji conversion device according to claim 1, wherein the phrase search means comprises: non-adjacent phrase search means for searching, with a predetermined phrase as the starting point and with reference to the dependency information stored in the dependency dictionary, for a phrase containing a word that corresponds to the dependency information, the search extending to phrases other than the phrase adjacent to the starting phrase; and search range exclusion means for excluding, when the non-adjacent phrase search means finds a dependency relationship, the range from the starting phrase to the found phrase from the range searched next by the non-adjacent phrase search means.

9. The kana-kanji conversion device according to any one of claims 1 to 3, comprising means for selecting, in a range where reference to the dependency dictionary finds no phrase containing a word in a dependency relationship, the combination that maximizes the likelihood of connection between words and between phrases.

10. A kana-kanji conversion device that receives a kana character string, segments the input kana character string into phrases, and generates candidate character strings constituting a kana-kanji mixed text corresponding to the input kana character string, the device comprising: a word dictionary that stores a reading of a word and a notation corresponding to the reading and that has a notation information section which, when a plurality of notations correspond to the reading of the word, stores the notation designated as the representative notation and the notations designated as derivative notations; a dependency information dictionary that stores information on the modifying word and the modified word that form a dependency between predetermined phrases; segmentation processing means for performing phrase segmentation using the word dictionary; grammar processing means for performing, for each segmented phrase, a dependency check against the dependency information dictionary using the representative notation stored in the word dictionary, and thereby identifying the words that make up the input character string; and candidate character string display means for displaying, for the identified words, the converted candidate character strings using the representative notation and the derivative notations stored in the notation information section of the word dictionary.

11. A kana-kanji conversion method that receives a kana character string, segments the input kana character string into phrases with reference to a dictionary, and generates kana-kanji mixed character string candidates, the method comprising: when segmenting the input character string into phrases, searching for phrases containing words that correspond to the information on the modifying word and the modified word, with reference to dependency information that stores the dependency between predetermined phrases together with information on the permitted attached words allowed between the modifying word and the modified word; when such phrases are found, determining whether the word attached to the modifying word is one of the permitted attached words and, when the attached word following the modified word expresses the causative or passive, determining whether the word attached to the modifying word is one that corresponds to the causative or passive; and restricting the phrase segmentation candidates on the basis of either determination result.

12. The kana-kanji conversion method according to claim 11, wherein, in place of the process of restricting the phrase segmentation candidates, a process of changing the priority of the kanji candidates for each phrase is performed on the basis of either determination result.

13. A kana-kanji conversion method that receives a kana character string, segments the input kana character string into phrases, and generates candidate character strings constituting a kana-kanji mixed text corresponding to the input kana character string, the method comprising: storing, in a word dictionary that stores a reading of a word and a notation corresponding to the reading, the notation designated as the representative notation and the notations designated as derivative notations when a plurality of notations correspond to the reading of the word; storing, in a dependency information dictionary, information on the modifying word and the modified word that form a dependency between predetermined phrases; performing phrase segmentation using the word dictionary; performing, for each segmented phrase, a dependency check against the dependency information dictionary using only the representative notation stored in the word dictionary, and thereby identifying the words that make up the input character string; and displaying, for the identified words, the converted candidate character strings using the representative notation and the derivative notations stored in the word dictionary.