JP2633824B2

JP2633824B2 - Kana-Kanji conversion device

Info

Publication number: JP2633824B2
Application number: JP59044025A
Authority: JP
Inventors: 正博阿部; 正紀川瀬
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1984-03-09
Filing date: 1984-03-09
Publication date: 1997-07-23
Anticipated expiration: 2012-07-23
Also published as: JPS60189565A

Description

[Detailed Description of the Invention]

[Field of the Invention]

The present invention relates to a kana-kanji conversion device that converts kana input into mixed kanji-kana text, and more particularly to a kana-kanji conversion device suited to so-called solid (unsegmented) input, which can be converted even when the input is not written with breaks between phrases.

[Background of the Invention]

Well-known conventional methods for kana-kanji conversion of solid input include the following: the method combining longest match with backtracking described in "HITAC L-320/30H, 50H Document Processing Functions," Hitachi Hyoron, Vol. 63, No. 5, May 1981 (hereinafter, reference (1)); the two-phrase longest-match method described in "A Kana-Kanji Conversion System for Solid Text," Proceedings of the 19th National Convention of the Information Processing Society of Japan, fiscal 1978, 5E-4 (hereinafter, reference (2)); and the minimum-phrase-count method described in "A Phrase Structure Analysis Algorithm Using a Table Method and Its Efficiency," Information Processing Society of Japan, Computational Linguistics Study Group Material 25-6, 1981 (hereinafter, reference (3)).

The method of reference (1) cuts out an independent word from the left end of the input by longest match, then assigns to it an attached word that can grammatically follow that independent word. This phrase conversion is repeated until the right end of the input is reached. If the right end cannot be reached, the method backtracks, tries a different phrase conversion, and proceeds from there.
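
The longest-match-with-backtracking idea of reference (1) can be sketched as follows. This is only a minimal illustration under simplifying assumptions, not the method of reference (1) itself: it segments against a flat set of readings and ignores the grammar that links independent words to attached words. The function name `segment_longest_match` and the toy lexicon are hypothetical.

```python
def segment_longest_match(text, lexicon):
    """Return one segmentation of text into lexicon entries, or None.

    At each position the longest matching entry is tried first. If the
    remainder cannot be segmented, the next shorter entry is tried.
    """
    if not text:
        return []
    for length in range(len(text), 0, -1):  # longest candidate first
        head = text[:length]
        if head in lexicon:
            rest = segment_longest_match(text[length:], lexicon)
            if rest is not None:
                return [head] + rest
    return None  # dead end: the caller backtracks to a shorter match


# Hypothetical toy lexicon of readings.
LEXICON = {"すう", "すうが", "がく"}
result = segment_longest_match("すうがく", LEXICON)
print(result)
```

Here the longest match 「すうが」 leaves 「く」, which cannot be segmented, so the sketch backtracks to 「すう」 and continues with 「がく」.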

The method of reference (2) exhaustively extracts all possible conversion candidates spanning two phrases from the left end of the input, selects the one for which the sum of the two phrase lengths is greatest, and takes the end of its first phrase as the first phrase break. It then repeats the same procedure starting from that break, so that conversion proceeds while the breaks are determined one phrase at a time.

The method of reference (3) exhaustively cuts out phrases starting from the left end and selects, from the resulting combinations, the one with the smallest number of phrases as the conversion result.

With solid input, there is freedom in where to place the breaks between phrases, so conversion generally produces more ambiguous candidates than with phrase-separated input. For example, the input 「すうがくかいせきじようでは」 admits many interpretations, such as 「数学解析上では」, 「数学科移籍上では」, 「数学会席上では」, and 「数学が遺跡上では」. Consequently, it has been difficult to obtain high conversion accuracy even with the methods described in references (1), (2), and (3).

[Object of the Invention]

An object of the present invention is to provide means suited to solid input that efficiently extracts and holds a plurality of conversion candidates with a high likelihood of being correct, and that allows the correct conversion result to be selected and confirmed from among them easily and quickly. A second object of the present invention is to provide a learning mechanism suited to solid input that improves conversion accuracy.

[Summary of the Invention]

To achieve the above objects, conversion proceeds from the left end of the input character string toward the right, cutting out conversion candidates, computing their likelihood, and holding those whose likelihood falls within a certain range. When processing reaches the right end of the input character string, the candidate sequence with the highest likelihood is selected from the conversion candidates and displayed as the conversion result. At the same time, the other held candidates are displayed in order from the left of the candidate sequence so that they can be selected and the result corrected. If a candidate different from the displayed conversion result is selected and confirmed, the likelihood of the remaining unconfirmed portion is re-evaluated and the displayed conversion result is changed if necessary. Selected compound words and independent words with affixes are retained, and when the same character string is input later, their likelihood is raised to improve conversion accuracy.

By selecting and holding likely candidates in this way, another candidate can be displayed and selected quickly even when a conversion error occurs, and the amount of storage required is small. When a conversion error is corrected by selecting another candidate, not only that portion but also the candidates that follow it are re-evaluated. For example, if the position of the break between candidates changes, a subsequent candidate sequence starting at the new break position is prepared automatically, which reduces the effort of later corrections. Furthermore, the corrected portion is learned, so the same error is not repeated. In particular, words with prefixes or suffixes and compound words are learned as long units, which makes the improvement in conversion accuracy substantial.

[Embodiment of the Invention]

An embodiment of the present invention is described below.

FIG. 1 shows the configuration of the kana-kanji conversion device of the present invention, which consists of an input/output unit 1, a control unit 2, and a storage unit 3. The input/output unit 1 consists of a keyboard for entering kana and for issuing conversion and selection instructions, and a display for showing conversion results and selection candidates. The control unit 2 controls the execution of kana-kanji conversion. The storage unit 3 temporarily holds the input kana character string and the data used in conversion.

FIG. 2 shows how conversion is carried out with this kana-kanji conversion device. The user enters a kana character string from the input/output unit 1 and instructs conversion (10). For this kana input, the control unit 2 performs the conversion process (20), which creates candidates with a high likelihood of being correct and stores them in the storage unit 3. Next, the control unit 2 creates, from the candidates in the storage unit 3, the candidate sequence with the highest likelihood together with the alternative candidates that may be selected in place of its first candidate (30), and has the input/output unit 1 display them (40). When the user selects the intended candidate from the selection candidates (50), the control unit 2 confirms the selected portion and learns the selected candidate (60). The control unit 2 then performs the display candidate creation process (30) again on the unconfirmed portion, creating and displaying the conversion result and alternative candidates for the following portion. The display (40), selection (50), confirmation process (60), and display candidate creation process (30) are repeated until the entire conversion result has been confirmed.

The operation of the control unit 2 described above is explained in more detail below.

FIG. 3 is a flowchart of the conversion process (20). FIG. 4 shows an example of the data created in the storage unit 3 by the conversion process (20). The conversion process (20) of FIG. 3 is explained concretely using the example of FIG. 4.

Suppose the input character string is 「すうがくかいせきじようでは」. In FIGS. 3 and 4, n is a pointer indicating a character position counted from the beginning of the input character string. In step 201 of FIG. 3, n is first set to 1, placing it at the beginning of the input character string. Step 202 checks whether any phrase ends at position n. The beginning of the character string is treated specially as a phrase end. The process then moves to step 203 and performs phrase conversion. Phrase conversion converts one phrase starting at position n and extracts the possible conversion candidates. Here, a phrase is a sequence consisting of one independent word, optionally preceded by a prefix, optionally followed by a suffix, and optionally followed by attached words. A compound word is treated as a phrase. In the example of FIG. 4, 「数学」, 「数」, 「吸う」, 「数学会」, 「数学界」, 「数学階」, 「数が」, 「吸うが」, and so on are extracted as conversion candidates from the beginning. A well-known method for this phrase-level kana-kanji conversion is described in "Kana-Kanji Conversion by Computer," NHK Technical Research, Vol. 25, No. 5, May 1973.

Next, step 204 judges the likelihood of each conversion candidate and decides whether it should be held in the storage unit 3 or discarded (called pruning). As described in references (1), (2), and (3), the likelihood is basically defined so that decomposing the input character string into a sequence of longer phrases, in other words into a sequence of fewer phrases, gives a higher likelihood. However, although they are likewise phrases, adnominals such as 「この」 and 「その」 and formal nouns such as 「こと」 and 「もの」 are usually used attached to other phrases, so they are not regarded as single independent phrases in the way nouns and verbs are, and they count as less than 1 when the number of phrases is counted. The number of phrases is thus computed with weights that take part of speech, frequency of occurrence, and the like into account, and the smaller this number, the higher the likelihood. Specifically, nouns, verbs, adjectives, and adjectival verbs are given a weight of 1; formal nouns, auxiliary verbs, adnominals, and the like a weight of 0.1; and prefixes and suffixes, treated as quasi-independent words, a weight of 0.5.
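
The weighted phrase count described above amounts to a simple sum, and can be sketched as follows. The weights 1, 0.1, and 0.5 are those given in the text. The part-of-speech labels, the function name `weighted_phrase_count`, and the two sample analyses are illustrative assumptions.

```python
# Weights per part-of-speech class, as given in the embodiment.
POS_WEIGHT = {
    "noun": 1.0, "verb": 1.0, "adjective": 1.0, "adjectival_verb": 1.0,
    "formal_noun": 0.1, "auxiliary_verb": 0.1, "adnominal": 0.1,
    "prefix": 0.5, "suffix": 0.5,
}


def weighted_phrase_count(units):
    """Sum the weights of (surface, pos) units; a lower total means a higher likelihood."""
    return sum(POS_WEIGHT[pos] for _surface, pos in units)


# Two hypothetical analyses of the same reading.
analysis_a = [("数学", "noun"), ("解析", "noun"), ("上", "suffix")]
analysis_b = [("数", "noun"), ("額", "noun"), ("解析", "noun"), ("上", "suffix")]

count_a = weighted_phrase_count(analysis_a)
count_b = weighted_phrase_count(analysis_b)
print(count_a, count_b)
```

Under these weights the analysis with fewer, longer units receives the smaller count and is therefore treated as more likely.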

Pruning computes the likelihood from the beginning of the character string to the end of the phrase currently being judged and takes it as the likelihood at the character position of that phrase end. If a likelihood has already been obtained at the same character position, the new value is compared with it, and pruning is performed if the new value exceeds a certain tolerance. In this embodiment, the tolerance is the minimum weighted phrase count at the same character position plus 1. Needless to say, however, the present invention is not limited by how the likelihood is defined or by the size of the pruning tolerance. The feature of the present invention is that all phrase candidates whose likelihood falls within a certain range are extracted and held in the storage unit 3, so that later selection and correction can be performed quickly and easily.
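
The pruning rule can be sketched as a small predicate. This is a minimal sketch, assuming the "likelihood" is the weighted phrase count (smaller is better) and that a candidate is pruned only when its count strictly exceeds the minimum at that position plus 1. The names `keep_candidate`, `best_at`, and `TOLERANCE` are hypothetical.

```python
TOLERANCE = 1.0  # allowance used in the embodiment: minimum + 1


def keep_candidate(best_at, end_pos, score):
    """Decide whether a candidate ending at end_pos survives pruning.

    best_at maps an end position to the smallest weighted phrase count
    seen so far at that position. It is updated in place.
    """
    best = best_at.get(end_pos)
    if best is not None and score > best + TOLERANCE:
        return False  # outside the tolerance: prune
    if best is None or score < best:
        best_at[end_pos] = score
    return True


best_at = {}
kept_first = keep_candidate(best_at, 4, 1.0)   # e.g. one unit covering four characters
kept_second = keep_candidate(best_at, 4, 2.0)  # e.g. two units covering the same span
kept_third = keep_candidate(best_at, 4, 2.5)   # exceeds 1.0 + 1.0
print(kept_first, kept_second, kept_third)
```

A larger tolerance keeps more alternatives available for later correction at the cost of more storage, which is the trade-off the text leaves open.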

In FIG. 4, every phrase extracted from the beginning is stored in the storage unit 3 (step 205). Storing the data as a network like the one shown in FIG. 4 reduces the storage capacity it occupies. For example, 「数」 and 「吸う」 share the same right phrase end, so the phrases that follow them need to be stored only once. Concrete techniques for representing data in this way are well known as list processing.
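
One way to picture the network is as a set of edges between character boundaries, so that candidates ending at the same boundary share whatever follows. The sketch below is an assumption about the representation, not the layout of FIG. 4. The class name `Lattice` and the 0-based boundary numbering are hypothetical.

```python
from collections import defaultdict


class Lattice:
    """Candidates stored as edges between character boundaries."""

    def __init__(self):
        self.edges_from = defaultdict(list)  # start -> [(end, surface)]

    def add(self, start, end, surface):
        self.edges_from[start].append((end, surface))

    def paths(self, start, final):
        """Yield every candidate sequence running from start to final."""
        if start == final:
            yield []
            return
        for end, surface in self.edges_from.get(start, []):
            for tail in self.paths(end, final):
                yield [surface] + tail


lattice = Lattice()
# Boundaries over the reading すうがく: 0 |す|う| 2 |が|く| 4
lattice.add(0, 2, "数")
lattice.add(0, 2, "吸う")
lattice.add(0, 4, "数学")
lattice.add(2, 4, "額")
lattice.add(2, 4, "楽")

all_paths = list(lattice.paths(0, 4))
stored_edges = sum(len(v) for v in lattice.edges_from.values())
print(len(all_paths), stored_edges)
```

Five stored edges represent five complete candidate sequences here, because 「額」 and 「楽」 are stored once and shared by 「数」 and 「吸う」.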

In step 206 of FIG. 3, n is incremented by 1 to advance the pointer to the next input character position. Step 207 determines whether there is still input at the character position of pointer n, and if so the process returns to step 202.

When n is 2, no phrase ends at this character position, so the process goes to step 206 and immediately moves pointer n to the next character position. When n is 3, the phrase ends of 「数」 and 「吸う」 are possible, so step 203 extracts the next phrases from this position, yielding phrases such as 「額」, 「楽」, 「額か」, 「額会」, and 「額科」. In step 204, 「数学」 has already been obtained as a candidate at position n = 4, and its weighted phrase count is 1. For the phrase candidates 「額」 and 「楽」, the weighted phrase count at position n = 4 is 2, the sum of 1 for 「数」 and 1 for 「額」 or 「楽」. Under the pruning condition, 「額」 and 「楽」 fall within the tolerance, so they are registered in the network in step 205.

As described above, repeating steps 202 through 207 until n reaches 14 yields the complete network shown in FIG. 4. Here, 「会」, 「界」, 「階」, 「科」, 「化」, 「上」, 「場」, and 「状」 are treated as suffixes.
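
The loop over steps 202 to 207 can be sketched as follows. This is a simplified sketch under stated assumptions: candidates come from a plain reading-to-entries dictionary rather than a phrase grammar, positions are 0-based boundaries rather than the 1-based pointer n of FIG. 3, and scores are measured against the best count at each start position. The function name `build_lattice` and the toy dictionary are hypothetical.

```python
def build_lattice(reading, dictionary, tolerance=1.0):
    """Scan left to right, extending candidates only from phrase ends.

    dictionary maps a reading to a list of (surface, weight) pairs.
    Returns (edges, best): edges[start] = [(end, surface, weight)] and
    best[pos] = smallest weighted phrase count reaching boundary pos.
    """
    edges = {}
    best = {0: 0.0}  # the start of the string counts as a phrase end
    for n in range(len(reading)):          # steps 206-207: advance the pointer
        if n not in best:                  # step 202: no phrase ends here
            continue
        for end in range(n + 1, len(reading) + 1):      # step 203: extract
            for surface, weight in dictionary.get(reading[n:end], []):
                score = best[n] + weight
                if end in best and score > best[end] + tolerance:
                    continue               # step 204: prune
                best[end] = min(best.get(end, score), score)
                edges.setdefault(n, []).append((end, surface, weight))  # step 205
    return edges, best


# Hypothetical toy dictionary.
DICTIONARY = {
    "すう": [("数", 1.0), ("吸う", 1.0)],
    "すうがく": [("数学", 1.0)],
    "がく": [("額", 1.0), ("楽", 1.0)],
}
edges, best = build_lattice("すうがく", DICTIONARY)
print(edges)
print(best)
```

In this run no phrase ends after the first character, so that position is skipped, and 「額」 and 「楽」 are kept because their count of 2 does not exceed the minimum of 1 plus the tolerance.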

In step 208, the pointer P, which indicates the starting position of the unconfirmed portion of the conversion result, is initialized to 1 to indicate that no portion has been confirmed yet.

This completes the conversion process (20) in the control unit 2, which then moves on to the display candidate creation process (30).

FIG. 5 is a flowchart of the display candidate creation process (30).

In step 301, the candidate sequence with the highest likelihood among those running from point P to the end of the input character string is set in the display buffer as the conversion result. If several candidate sequences have the same likelihood, the learning table in the storage unit 3, which holds compound words and independent words with affixes taken from previously confirmed character strings, is consulted, and the candidate sequence containing the most words found in this table is selected as the conversion result. If this still does not determine a unique result, the candidate sequence with the greatest total independent-word length is selected. Other ways of choosing the conversion result, such as using word frequency, are also possible, and the factors described above can be combined freely to form other selection methods.
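
The selection order of step 301 can be expressed as a composite sort key. The following is a minimal sketch, assuming each unit carries its weight and independent-word length and that the learning table can be treated as a set of surfaces. The function name `choose_result` and the sample data are hypothetical.

```python
def choose_result(candidates, learned):
    """Pick one candidate sequence.

    Each candidate is a list of (surface, weight, independent_len) units.
    Preference order: smallest weighted phrase count, then most units
    found in the learning table, then largest total independent-word length.
    """
    def key(seq):
        count = sum(w for _s, w, _l in seq)
        hits = sum(1 for s, _w, _l in seq if s in learned)
        length = sum(l for _s, _w, l in seq)
        return (count, -hits, -length)

    return min(candidates, key=key)


candidates = [[("記者", 1.0, 3)], [("汽車", 1.0, 3)]]
without_learning = choose_result(candidates, learned=set())
with_learning = choose_result(candidates, learned={"汽車"})
print(without_learning, with_learning)
```

With an empty learning table the tie is left to input order in this sketch. Once 「汽車」 has been learned, it wins the tie.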

In step 302, the other candidates starting at point P are set in the display buffer so that the user can select among them.

In the example of FIG. 4, when P is 1, 「数学解析上では」 is chosen as the conversion result, and 「数学」, 「数」, and 「吸う」 are retrieved as the candidates for selection and set in the display buffer. In addition, to make clear which part of the character string is currently subject to selection, the corresponding parts of the conversion result and of the input character string are highlighted (display (40)).

FIG. 7 shows the screen displayed on the input/output unit 1 immediately after input (10) has finished, and FIG. 8 shows the screen displayed after the display candidate creation process (30) has finished. On this screen, the bottom is the input line, the top is the conversion result output line, and the lower right is the selection candidate display area.

The user looks at the conversion result on the screen and makes a selection (50). The selection is made by using the keyboard to pick the intended candidate from the selection candidates, but if the conversion result highlighted at the top of the screen is correct, it can also be chosen by pressing a particular key.

When the user has made a selection (50), the control unit 2 next executes the confirmation process (60).

FIG. 6 is a flowchart of the confirmation process (60).

In step 601, the network data in the storage unit 3 corresponding to the candidate the user selected is marked, and the length of the selected candidate is added to P so that P indicates the starting position of the unconfirmed portion that follows.

In step 602, the selected candidate is stored, together with its reading, in the learning table in the storage unit 3.

In the display candidate creation process (30), the control unit 2 re-evaluates the likelihood of the network consisting of the candidate selected in the confirmation process (60) and the candidates from point P onward, and sets the resulting subsequent conversion result and the other candidates starting at point P in the display buffer. FIG. 9 shows the screen after the user has selected 「数学」. The highlight has moved to 「解析」, which follows 「数学」. The selection candidates displayed are 「解析」, 「会」, 「界」, 「階」, 「科」, 「化」, and 「か」.
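
Re-evaluating from point P can be sketched as a best-path search over the held network, restarted at the new P. The sketch below is an assumption about how such a search might look, not the procedure of FIG. 5. The function name `best_from`, the edge layout, the 0-based boundaries, and the weights in the sample data are hypothetical.

```python
def best_from(edges, start, final):
    """Return (score, surfaces) of the lowest-score path from start to final.

    edges[pos] = [(end, surface, weight)]. Returns None if final cannot
    be reached. Ties keep the path found first.
    """
    if start == final:
        return (0.0, [])
    best = None
    for end, surface, weight in edges.get(start, []):
        tail = best_from(edges, end, final)
        if tail is not None:
            cand = (tail[0] + weight, [surface] + tail[1])
            if best is None or cand[0] < best[0]:
                best = cand
    return best


# Boundaries over the reading かいせきじよう (7 characters), hypothetical weights.
EDGES = {
    0: [(4, "解析", 1.0), (2, "会", 0.5)],
    2: [(7, "席上", 1.0), (4, "関", 1.0), (4, "咳", 1.0)],
    4: [(7, "上", 0.5)],
}
initial = best_from(EDGES, 0, 7)        # nothing confirmed in this span yet
after_choice = best_from(EDGES, 2, 7)   # a two-character candidate was confirmed
print(initial, after_choice)
```

Because the network is kept, confirming a candidate with a different break only moves the starting boundary. The tail is recomputed from data already held rather than by converting again.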

If the user selects 「会」 rather than 「解析」 from the candidates, the conversion result following 「会」 is automatically changed to 「席上」, obtained from the network of FIG. 4, as shown in FIG. 10, and 「席上」, 「関」, and 「咳」 are displayed as the selection candidates. Because the adjacency relations between candidates are held in this way, when a phrase break is changed by a selection (50), the subsequent display can be corrected at the same time, reducing the effort of later corrections.

The user's selection of 「会」 also causes the learning table to be updated in the confirmation process (60). In this case, 「数学」 has already been registered and 「会」 is a suffix, so 「会」 is appended to the previously registered 「数学」 and the entry is re-registered as 「数学会」. FIG. 11 shows the contents of the table after 「数学会」 has been registered.
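
The suffix-merging update of the learning table can be sketched as follows. This is a minimal sketch under assumptions: the table is modeled as a mapping from surface to reading, and the merged entry replaces the earlier one, which the text does not state explicitly. The function name `learn` and the `last` argument are hypothetical.

```python
# Suffixes listed in the embodiment.
SUFFIXES = {"会", "界", "階", "科", "化", "上", "場", "状"}


def learn(table, last, reading, surface):
    """Record a confirmed candidate and return the entry just registered.

    table maps surface -> reading. If surface is a suffix that directly
    follows the entry registered last, the two are merged into one longer
    entry (assumption: the shorter entry is replaced by the merged one).
    """
    if surface in SUFFIXES and last in table:
        merged = last + surface
        table[merged] = table.pop(last) + reading
        return merged
    table[surface] = reading
    return surface


table = {}
last = learn(table, None, "すうがく", "数学")
last = learn(table, last, "かい", "会")
print(table)
```

Learning the longer unit is what allows a later input of the same reading to be resolved as a whole rather than affix by affix.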

The control unit 2 repeats the display candidate creation process (30) and the confirmation process (60) until no unconfirmed portion remains. When the entire conversion result has been confirmed, processing ends.

When the series of processes described above ends, the network data in the storage unit 3 is erased, but the data in the learning table is retained. If the same character string is input later, it is adopted preferentially as the conversion result, so conversion accuracy improves with use. Learning functions accompanying conversion have been used before, but learning prefixes and suffixes on their own was a problem: homophonous affixes often appear in the same text, so such learning often lowered conversion accuracy instead. Likewise, when homophones appear together, as in 「新聞記者が汽車で」, learning was done in units of the reading 「きしや」, so 「記者」 and 「汽車」 could not be distinguished. According to the present invention, prefixes and suffixes are learned together with the independent words they attach to, and compound words are learned as long units without being decomposed into independent words, so learning yields a greater improvement in accuracy.

[Effect of the Invention]

According to the present invention, candidates with a high likelihood of being correct are held, so even if the conversion result contains an error, another candidate can be selected immediately and correction is faster. By carrying out the creation and holding of candidates in parallel with the user's input, the user's wait for conversion can be shortened further. In addition, the learning function for compound words and independent words with affixes improves conversion accuracy.

[Brief description of the drawings]

FIG. 1 is a block diagram of the kana-kanji conversion device according to the present invention, FIG. 2 shows how conversion is carried out, FIG. 3 is a flowchart of the conversion process, FIG. 4 shows an example of the data, FIG. 5 is a flowchart of the display candidate creation process, FIG. 6 is a flowchart of the confirmation process, FIGS. 7 to 10 show screen display information, and FIG. 11 shows an example of the contents of the learning table.

1: input/output unit, 2: control unit, 3: storage unit.

Continuation of front page

(56) References cited: JP-A-57-14971 (JP, A); JP-A-58-115528 (JP, A); JP-A-56-17468 (JP, A); JP-A-58-129633 (JP, A); JP-A-58-19963 (JP, A); JP-A-59-121529 (JP, A)

Claims

(57) [Claims]

1. A kana-kanji conversion device comprising: character string input means for inputting a kana character string that contains a plurality of phrases and is not separated into phrase units; storage means for storing the input kana character string; conversion instruction means for instructing kana-kanji conversion of the kana character string into a mixed kanji-kana sentence; conversion means for, in response to the kana-kanji conversion instruction, cutting out a plurality of phrases from the kana character string and extracting, for each phrase, at least one conversion candidate beginning at the first character of that phrase, thereby performing kana-kanji conversion; display means for displaying a candidate character string composed of, for each phrase, the conversion candidate with the highest likelihood among the extracted conversion candidates; phrase break indication means for reselecting a conversion candidate constituting the candidate character string and thereby indicating a phrase break in the input kana character string; and control means for causing the display means to display conversion candidates obtained by re-evaluating the likelihood of conversion for the entire undetermined portion, following the indicated phrase break, of the kana character string stored in the storage means.

2. The kana-kanji conversion apparatus according to claim 1, wherein said storage means stores, together with said kana character string, a conversion candidate having a likelihood of certainty higher than a predetermined value among said conversion candidates.

3. The kana-kanji conversion device according to claim 2, wherein the likelihood is determined by the number of weighted components determined for each part of speech category of the components in the phrase.

4. The kana-kanji conversion device according to claim 2, wherein said storage means stores said conversion candidates in the form of a network.