JPS58225428A

JPS58225428A - Kana (japanese syllabary)-kanji (chinese character) converting system

Info

Publication number: JPS58225428A
Application number: JP57109395A
Authority: JP
Inventors: Hirokawa Hayashi; 林　大川; Yoshitoshi Yamauchi; 佐敏山内; Tetsuya Ishikawa; 徹也石川
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1982-06-25
Filing date: 1982-06-25
Publication date: 1983-12-27

Abstract

PURPOSE: To reduce the operator's workload and make a kana-kanji converter easier to use, by decomposing the kana string of each input bunsetsu (phrase) into words, displaying the conversion candidates, and displaying the next candidate when the first candidate string is not the desired one. CONSTITUTION: The kana string to be converted is extracted from a kana sentence fed to an input preprocessing part 1 and is transferred to a word extracting part 2 under the control of a conversion control part 5. The kana string is received by a storage part 2A of the part 2 and sent to a word extraction control part 2B, which decomposes it into words. These words are collated, through a dictionary retrieval control part 2D, with the contents of a conversion dictionary storage part 2E. The conversion candidates found are stored in a candidate word storage part 2C. If the string desired by the operator is not among the candidates determined in this way, a new candidate of the form (independent word + remaining kana string) is displayed in a new stage, so that the operator can select the new string. The same operation is then repeated.

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a kana-kanji conversion processing method for Japanese word processors and similar equipment. In particular, it relates to a kana-kanji conversion processing method of the type in which bunsetsu (phrase) delimiter information is supplied, and which can speed up document preparation work.

Various kana-kanji conversion (hereinafter also simply "conversion") processing methods have been proposed. Conventional methods differ in details such as whether they rely on bunsetsu designation or word designation, but in general they search for every possible conversion string for the input kana string and display the results. When homophones written with different characters (hereinafter "homophones") exist, the conversion strings are displayed starting with the most likely candidate and the operator is asked to choose. If the displayed string is not what the operator intended, pressing the next-candidate key displays another conversion string in its place.

If the intended string appears during this process, the operator presses the selection key and the conversion string is fixed.

However, consider what happens in the above method when the operator inputs, as a single bunsetsu segmentation unit, the reading of an expression that the conversion dictionary treats as several words. The reading can usually be split into words in more than one way, and the resulting words may themselves have homophones. The number of conversion candidates obtained by simply combining dictionary words therefore becomes very large in general, and displaying all of them for the operator to choose from places an extremely heavy burden on the operator.
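The growth in candidates described above can be illustrated with a small sketch. It enumerates every way of splitting a reading into dictionary words and multiplies the homophone counts. The dictionary `TOY_DICT` and the function names are hypothetical and chosen only for illustration; they are not taken from the patent.

```python
from math import prod

# Hypothetical toy dictionary: kana reading -> kanji spellings (homophones).
TOY_DICT = {
    "き": ["木", "気", "機"],
    "しゃ": ["社", "車"],
    "きしゃ": ["記者", "汽車", "貴社", "帰社"],
}


def segmentations(reading, dictionary):
    """Return every way to split `reading` into readings found in `dictionary`."""
    if not reading:
        return [[]]  # one way to split the empty string: no words
    results = []
    for end in range(1, len(reading) + 1):
        head = reading[:end]
        if head in dictionary:
            for tail in segmentations(reading[end:], dictionary):
                results.append([head] + tail)
    return results


def count_candidates(reading, dictionary):
    """Count conversion strings produced by freely combining dictionary words."""
    return sum(
        prod(len(dictionary[word]) for word in words)
        for words in segmentations(reading, dictionary)
    )


print(count_candidates("きしゃ", TOY_DICT))        # one short reading
print(count_candidates("きしゃきしゃ", TOY_DICT))  # the same reading twice
```

Even with three dictionary entries, doubling the length of the reading squares the number of candidates, which is the burden the paragraph above refers to.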

To deal with this, a method has been proposed that does not allow several words in one segmentation unit, that is, it limits each segmentation unit to a single word of the main conversion dictionary. With this method, however, the operator must segment the input with a good knowledge of the contents and characteristics of the conversion dictionary, which is troublesome and in some cases leads to segmentation errors.

The present invention was made in view of the above circumstances. Its object is to provide a kana-kanji conversion processing method that eliminates the above problems of conventional conversion methods, reduces the conversion candidates to a practical number, lightens the operator's burden, and improves ease of use.

The above object of the present invention is achieved by a kana-kanji conversion processing method in which an input sentence is divided into segmentation units centered on independent words and input in kana characters, and the corresponding mixed kanji-kana text is obtained successively. The method is characterized by providing means for decomposing the input kana string of a bunsetsu unit into words, means for displaying the conversion candidates for each decomposed word in order from the first word of the input kana string, and means for displaying the runner-up candidate when a conversion candidate is a string the operator did not intend. When the output string the operator intends has been decomposed into several words by the decomposing means, the operator determines the intended conversion string word by word, starting from the first word of the input kana string.

Embodiments of the present invention are described in detail below with reference to the drawings. FIG. 1 is a block diagram of kana-kanji conversion processing according to one embodiment of the present invention. In the figure, 1 is an input preprocessing unit, 2 is a word extraction unit, 3 is a homophone discrimination unit, 4 is an output control unit, and 5 is a conversion control unit.

When a Japanese sentence is input in kana, it is output as mixed kanji-kana text through the following processing.

The input preprocessing unit 1 identifies alphanumeric characters, bunsetsu delimiter information, and the like in the input kana text, extracts the kana string to be converted, and, under the control of the conversion control unit 5, passes the kana string that forms the conversion unit to the word extraction unit 2. The word extraction unit 2 refers to a word dictionary indexed by kana readings, to the part-of-speech information attached to the entries of that dictionary, and to a dictionary holding connection information for each part of speech. It attempts to match the kana string against the dictionary headwords and extracts the candidate word sequences that are grammatically permissible. When several candidate word sequences exist, the homophone discrimination unit 3 determines the most likely one using information such as word frequency. The output control unit 4 displays the characters corresponding to the bunsetsu determined in this way on the output display device. After that, when the input preprocessing unit 1 receives the control signal of the next-candidate control key, the runner-up candidate is displayed in place of the most likely candidate shown first.
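The ranking by frequency and the next-candidate key described above can be sketched as follows. This is a minimal illustration, assuming each candidate carries a single numeric frequency; the class name `CandidateList`, its methods, and the frequency values are hypothetical and do not come from the patent.

```python
class CandidateList:
    """Holds conversion candidates ordered from most to least frequent."""

    def __init__(self, candidates):
        # `candidates` is an iterable of (string, frequency) pairs.
        # sorted() is stable, so ties keep their input order.
        ranked = sorted(candidates, key=lambda pair: -pair[1])
        self._ranked = [string for string, _ in ranked]
        self._index = 0

    def current(self):
        """Return the candidate currently displayed, or None if there is none."""
        if not self._ranked:
            return None
        return self._ranked[self._index]

    def next_candidate(self):
        """Simulate the next-candidate key.

        Returns the runner-up candidate, or None once the list is exhausted
        (an assumption of this sketch; the caller decides what happens then).
        """
        if self._index + 1 >= len(self._ranked):
            return None
        self._index += 1
        return self._ranked[self._index]


# Made-up frequencies for three homophones of one reading.
shown = CandidateList([("汽車", 5), ("記者", 9), ("貴社", 7)])
print(shown.current())         # most frequent candidate is shown first
print(shown.next_candidate())  # operator presses the next-candidate key
```

Returning None on exhaustion gives the caller a clear signal that none of the displayed candidates was accepted.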

The word extraction unit 2, which is the key part of the present invention, is described below.

As shown in FIG. 2, the word extraction unit 2 consists of a bunsetsu-unit kana string storage unit 2A, a word extraction control unit 2B, a candidate word storage unit 2C, a dictionary search control unit 2D, and a conversion dictionary storage unit 2E. The bunsetsu-unit kana string storage unit 2A receives the bunsetsu-unit string signal and sends it to the word extraction control unit 2B. The word extraction control unit 2B decomposes it into words and collates them, through the dictionary search control unit 2D, with the contents stored in the conversion dictionary storage unit 2E. The conversion candidates found by this processing are stored in the candidate word storage unit 2C.

The above processing of the word extraction unit 2 is carried out in the order of the steps (stages) shown in FIG. 3. That is, in stage 1 an independent word is extracted from the bunsetsu-unit kana string by the longest-match method, and word combinations are determined on that basis, following the numbered patterns shown in FIG. 3.
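The longest-match extraction used in stage 1 can be sketched as below: the longest prefix of the kana string that appears as a dictionary headword is taken as the word. The function name `longest_match` and the set `HEADWORDS` are hypothetical, and the sketch ignores the part-of-speech and connection checks that the word extraction unit also applies.

```python
# Hypothetical set of dictionary headwords (kana readings of independent words).
HEADWORDS = {"かん", "かんじ", "へん", "へんかん"}


def longest_match(kana, headwords):
    """Split `kana` into (longest headword prefix, remaining kana).

    Returns (None, kana) when no prefix of `kana` is a headword.
    """
    # Try the longest prefix first and shorten it one character at a time.
    for end in range(len(kana), 0, -1):
        prefix = kana[:end]
        if prefix in headwords:
            return prefix, kana[end:]
    return None, kana


word, rest = longest_match("かんじへんかん", HEADWORDS)
print(word, rest)  # "かんじ" is preferred over the shorter headword "かん"
```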

If no combination qualifying as a conversion candidate is found in stage 1, or if the conversion candidates determined in stage 1 do not include the conversion string the operator intends, processing moves to stage 2. In stage 2, the operator is notified that stage 2 has been entered, and candidate combinations for the word at the head of the kana string are displayed in the form "first word + remaining kana string", following the numbered patterns of FIG. 3 (in one of those patterns, for example, the first word is "independent word 1"). If the conversion candidate for the first word is a string the operator did not intend, the operator can display the next candidate with a control key, as in stage 1. If the first word is what the operator intended, the operator uses another control key to issue a command to proceed to stage 3. If no candidate for the first word can be found in stage 2, the processing of the word extraction unit 2 ends.
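The stage-2 display form, "first word + remaining kana string", can be sketched as follows. The ordering (longest reading first, then dictionary order) and the rule that the remainder must be non-empty are assumptions of this sketch, and `TOY_DICT` and `stage2_candidates` are hypothetical names, not the patent's.

```python
# Hypothetical toy dictionary: kana reading -> kanji spellings.
TOY_DICT = {
    "き": ["木", "気"],
    "きしゃ": ["記者", "汽車"],
}


def stage2_candidates(kana, dictionary):
    """List (converted first word, remaining kana) pairs for display.

    Only proper prefixes are tried, so the remainder is never empty
    (assumption: a whole-string match was already offered in stage 1).
    """
    candidates = []
    for end in range(len(kana) - 1, 0, -1):  # longest first word first
        for spelling in dictionary.get(kana[:end], []):
            candidates.append((spelling, kana[end:]))
    return candidates


for first_word, rest in stage2_candidates("きしゃのき", TOY_DICT):
    print(first_word + rest)  # converted head followed by unconverted kana
```

Only the head word varies from one displayed line to the next, so the operator decides one word at a time instead of scanning every full combination.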

In stage 3, the kana string that remains after removing the first word of the stage-2 patterns in FIG. 3 is processed in the same way as in stage 1: word combinations are compared and the operator is asked to select (selection of up to two independent words).

If kana characters still remain after this, either the processing of the word extraction unit 2 is terminated or the stage-3 operation is repeated on the remaining kana string.
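The repetition over the remaining kana string can be sketched as a loop that fixes one word at a time from the head. This is a simplified illustration, not the patent's procedure: the operator is simulated by a callback `accept`, every dictionary entry is treated as a selectable word, and the limit of two independent words per stage-3 selection is not modelled. All names are hypothetical.

```python
# Hypothetical toy dictionary: kana reading -> kanji spellings.
TOY_DICT = {
    "き": ["木", "気"],
    "きしゃ": ["記者", "汽車", "貴社"],
    "の": ["の"],
}


def convert_word_by_word(kana, dictionary, accept):
    """Fix the conversion from the head of `kana`, one word at a time.

    `accept(spelling)` stands in for the operator: True means the
    selection key was pressed, False means the next-candidate key.
    Returns (converted text, kana left unconverted).
    """
    confirmed = []
    rest = kana
    while rest:
        chosen = None
        for end in range(len(rest), 0, -1):  # longest reading first
            for spelling in dictionary.get(rest[:end], []):
                if accept(spelling):
                    chosen = (spelling, end)
                    break
            if chosen:
                break
        if chosen is None:
            break  # no acceptable candidate for the head word: stop
        confirmed.append(chosen[0])
        rest = rest[chosen[1]:]
    return "".join(confirmed), rest


wanted = {"汽車", "の", "木"}
print(convert_word_by_word("きしゃのき", TOY_DICT, wanted.__contains__))
```

In the usage line the simulated operator rejects 記者, accepts 汽車, and then confirms the two remaining words in turn.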

In FIG. 3, the mark printed as 0" indicates arbitrary repetition, [ ] indicates an element that may be present or absent, and the arrows indicate the order in which word combinations are compared.

In the above embodiment, the display method may be changed from stage 2 onward so that the part already converted can easily be distinguished from the part not yet converted.

As described above, the present invention concerns a kana-kanji conversion processing method in which an input sentence is divided into segmentation units centered on independent words and input in kana characters, and the corresponding mixed kanji-kana text is obtained successively. The method provides means for decomposing the input kana string of a bunsetsu unit into words, means for displaying the conversion candidates for each decomposed word in order from the first word of the input kana string, and means for displaying the runner-up candidate when a conversion candidate is a string the operator did not intend. When the output string the operator intends has been decomposed into several words by the decomposing means, the operator determines the intended conversion string word by word from the first word of the input kana string. As a result, several independent words can be allowed in one bunsetsu segmentation unit while the number of homophone combinations is reduced, which gives the notable effect of realizing a kana-kanji conversion processing method with a lighter burden on the operator.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of kana-kanji conversion processing according to one embodiment of the present invention, FIG. 2 is a diagram showing the word extraction unit in detail, and FIG. 3 is a diagram showing the processing steps.

1: input preprocessing unit; 2: word extraction unit; 2A: bunsetsu-unit kana string storage unit; 2B: word extraction control unit; 2C: candidate word storage unit; 2D: dictionary search control unit; 2E: conversion dictionary storage unit; 3: homophone discrimination unit; 4: output control unit; 5: conversion control unit.

Patent applicant: Ricoh Co., Ltd. Agent: Patent attorney Masatoshi Isomura

Claims

[Claims]

A kana-kanji conversion processing method in which an input sentence is divided into segmentation units centered on independent words and input in kana characters, and the corresponding mixed kanji-kana text is obtained successively, the method being characterized by comprising: means for decomposing the input kana string of a bunsetsu unit into words; means for displaying the conversion candidates for each decomposed word in order from the first word of the input kana string; and means for displaying the runner-up candidate when a conversion candidate is a string not intended by the operator; wherein, when the output string intended by the operator has been decomposed into a plurality of words by the decomposing means, the operator determines the intended conversion string in order from the first word of the input kana string.