JPS58225428A - Kana (japanese syllabary)-kanji (chinese character) converting system - Google Patents

Kana (japanese syllabary)-kanji (chinese character) converting system

Info

Publication number
JPS58225428A
Authority
JP
Japan
Prior art keywords
kana
conversion
word
operator
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP57109395A
Other languages
Japanese (ja)
Inventor
Hirokawa Hayashi
林 大川
Yoshitoshi Yamauchi
佐敏 山内
Tetsuya Ishikawa
徹也 石川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd
Priority to JP57109395A
Publication of JPS58225428A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/53Processing of non-Latin text

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)

Abstract

PURPOSE: To reduce the operator's workload and make a kana-kanji converter easier to use by decomposing the kana string of each input bunsetsu (phrase) into words, displaying the conversion candidates, and displaying the second candidate when the first candidate string is not the desired one. CONSTITUTION: An input preprocessing unit 1 extracts the kana string to be converted from the input kana sentence and, under the control of a conversion control unit 5, passes it to a word extraction unit 2. The kana string is received by a storage unit 2A of the word extraction unit 2 and sent to a word extraction control unit 2B, which decomposes it into words. These words are checked, through a dictionary search control unit 2D, against the contents of a conversion dictionary storage unit 2E. The conversion candidates found are stored in a candidate word storage unit 2C. If the string the operator wants is not among the candidates determined in this way, the process moves to a new stage in which a new combination (independent word + remaining kana string) is displayed, and the operator can select from the new strings. The same operation is then repeated.

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a kana-kanji conversion processing method for Japanese word processors and similar equipment, and in particular to a kana-kanji conversion processing method that can speed up document creation in systems where the operator supplies bunsetsu (phrase) boundary information.

Various methods have been proposed for kana-kanji conversion (hereinafter also simply "conversion"). Conventional methods differ in whether the operator marks bunsetsu boundaries or word boundaries, but in general they search for every possible conversion of the input kana string and display the result. When homophones written with different characters (hereinafter "homophones") exist, the system displays the most likely conversion string first and asks the operator to select or confirm it. If the displayed string is not what the operator intended, pressing the next-candidate key replaces it with another conversion string. Once the intended string appears, the operator presses the selection key and the conversion string is fixed.

With the method described above, however, consider the case where the operator enters, as one bunsetsu segmentation unit, the reading of an expression that the conversion dictionary treats as several words. The reading can generally be split into words in more than one way, and each of the resulting words may itself have homophones. The number of conversion candidates obtained by simply combining dictionary words is therefore usually very large, and displaying them all for the operator to choose from places an extremely heavy burden on the operator.
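The growth in candidates described above can be illustrated with a minimal Python sketch. The toy dictionary and the function names are assumptions made for illustration only; they do not come from the patent.

```python
from itertools import product

# Hypothetical toy dictionary (kana reading -> kanji candidates); not from the patent.
TOY_DICT = {
    "き": ["木", "気"],
    "しゃ": ["車", "社"],
    "きしゃ": ["記者", "汽車", "貴社"],
}

def segmentations(kana, dictionary):
    """Return every way to split `kana` into consecutive dictionary readings."""
    if not kana:
        return [[]]
    result = []
    for end in range(1, len(kana) + 1):
        head = kana[:end]
        if head in dictionary:
            for tail in segmentations(kana[end:], dictionary):
                result.append([head] + tail)
    return result

def all_candidates(kana, dictionary):
    """Combine every segmentation with every homophone of each word."""
    out = []
    for seg in segmentations(kana, dictionary):
        for choice in product(*(dictionary[word] for word in seg)):
            out.append("".join(choice))
    return out
```

With this toy data the single reading "きしゃ" already has two segmentations, and repeating it squares the number of combined candidates, which is the kind of growth the conventional approach leaves to the operator.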

To address this, a method has been proposed that does not allow multiple words in one segmentation unit; that is, each segmentation unit is limited to a single principal word of the conversion dictionary. With this method, however, the operator must segment the input with a good knowledge of the contents and characteristics of the conversion dictionary, which is cumbersome and in some cases leads to segmentation errors.

The present invention was made in view of the above circumstances. Its object is to provide a kana-kanji conversion processing method that eliminates the problems of conventional conversion methods described above, reduces the conversion candidates to a practical number, lightens the operator's burden, and improves usability.

The above object is achieved by a kana-kanji conversion processing method in which an input sentence is divided into segmentation units, each centered on an independent word, and entered in kana, and the corresponding mixed kanji-kana sentence is obtained successively. The method provides means for decomposing the bunsetsu-unit input kana string into words, means for displaying the conversion candidates for each decomposed word in order from the first word of the input kana string, and means for displaying the runner-up candidate when a conversion candidate is not the string the operator intended. When the output string intended by the operator has been decomposed into multiple words by the decomposing means, the operator determines the intended conversion string word by word, starting from the first word of the input kana string.

An embodiment of the present invention is described in detail below with reference to the drawings. FIG. 1 is a block diagram of kana-kanji conversion processing according to one embodiment of the invention. In the figure, 1 is an input preprocessing unit, 2 is a word extraction unit, 3 is a homophone discrimination unit, 4 is an output control unit, and 5 is a conversion control unit.

When a Japanese sentence is entered in kana, it is output as a mixed kanji-kana sentence through the following processing.

The input preprocessing unit 1 identifies alphanumeric characters, bunsetsu boundary information, and the like in the input kana sentence, extracts the kana string to be converted, and, under the control of the conversion control unit 5, passes the kana string that forms one conversion unit to the word extraction unit 2. The word extraction unit 2 consults a word dictionary keyed by kana readings, the part-of-speech information attached to its entries, and a dictionary of connection rules for each part of speech. It tries to match the kana string against the dictionary headwords and extracts the grammatically permissible word-sequence candidates. When more than one word-sequence candidate exists, the homophone discrimination unit 3 determines the most likely word sequence using information such as word frequency. The output control unit 4 displays the characters for the bunsetsu determined in this way on the output display device. If the input preprocessing unit 1 then receives a control signal from the next-candidate control key, the runner-up candidate is displayed in place of the most likely candidate shown first.
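The ranking by frequency and the next-candidate key can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Candidate` and `CandidateList` names, the integer frequency field, and the wrap-around after the last candidate are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    surface: str     # kanji-kana string shown to the operator
    frequency: int   # assumed usage count attached to the dictionary entry

class CandidateList:
    """Holds candidates ordered from most to least frequent."""

    def __init__(self, candidates):
        if not candidates:
            raise ValueError("at least one candidate is required")
        self._ranked = sorted(candidates, key=lambda c: c.frequency, reverse=True)
        self._index = 0

    def current(self):
        """Return the candidate currently displayed."""
        return self._ranked[self._index].surface

    def next_candidate(self):
        """Simulate the next-candidate key (wrap-around is an assumption)."""
        self._index = (self._index + 1) % len(self._ranked)
        return self.current()
```

For example, with hypothetical frequencies 9, 5 and 2 for 記者, 汽車 and 貴社, `current()` shows 記者 first and each call to `next_candidate()` steps to the next most frequent entry.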

The word extraction unit 2, which is the core of the present invention, is described below.

As shown in FIG. 2, the word extraction unit 2 consists of a bunsetsu-unit kana string storage unit 2A, a word extraction control unit 2B, a candidate word storage unit 2C, a dictionary search control unit 2D, and a conversion dictionary storage unit 2E. The bunsetsu-unit kana string storage unit 2A receives the bunsetsu-unit string signal and sends it to the word extraction control unit 2B. The word extraction control unit 2B decomposes it into words and, through the dictionary search control unit 2D, checks them against the contents stored in the conversion dictionary storage unit 2E. The conversion candidates found by this processing are stored in the candidate word storage unit 2C.
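The data flow among units 2A to 2E can be sketched as a small Python class. The class and method names are hypothetical, and looking up every substring is an assumption made to keep the sketch short; the patent text here does not specify how 2B enumerates words.

```python
class WordExtractionUnit:
    """Illustrative model of units 2A-2E; not the patented implementation."""

    def __init__(self, conversion_dictionary):
        self.kana_buffer = ""                          # 2A: bunsetsu-unit kana string storage
        self.dictionary = dict(conversion_dictionary)  # 2E: conversion dictionary storage
        self.candidates = {}                           # 2C: candidate word storage

    def receive(self, kana):
        """2A: store the incoming bunsetsu-unit kana string."""
        self.kana_buffer = kana

    def lookup(self, reading):
        """2D: dictionary search control, returns kanji candidates for a reading."""
        return list(self.dictionary.get(reading, []))

    def extract(self):
        """2B: try each substring as a word and keep those found via 2D."""
        buf = self.kana_buffer
        self.candidates = {}
        for start in range(len(buf)):
            for end in range(start + 1, len(buf) + 1):
                found = self.lookup(buf[start:end])
                if found:
                    self.candidates[(start, end)] = found
        return self.candidates
```

The keys of `candidates` are (start, end) character positions in the stored kana string, so later steps can tell which part of the input each candidate covers.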

The above processing of the word extraction unit 2 is carried out in the order of the steps (stages) shown in FIG. 3. In stage 1, independent words are extracted from the bunsetsu-unit kana string by the longest-match method, and the word combinations are determined on that basis, following the numbered patterns shown for stage 1 in FIG. 3.
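A minimal sketch of longest-match extraction, assuming a plain mapping from readings to kanji candidates. The function name and return convention are illustrative assumptions, and the grammatical checks mentioned earlier in the description are omitted.

```python
def longest_match(kana, dictionary):
    """Return (reading, rest) for the longest dictionary reading that
    prefixes `kana`, or (None, kana) if no prefix is in the dictionary."""
    for end in range(len(kana), 0, -1):   # try the longest prefix first
        head = kana[:end]
        if head in dictionary:
            return head, kana[end:]
    return None, kana
```

Given readings "き" and "きしゃ" in the dictionary, the input "きしゃで" yields "きしゃ" with "で" left over, because the longer reading is tried before the shorter one.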

If no combination corresponding to a conversion candidate is found in stage 1, or if the conversion candidates determined in stage 1 do not include the string the operator intended, the process moves to stage 2. In stage 2 the operator is notified that stage 2 has begun, and combination candidates for the word at the head of the kana string are displayed in the form "first word + remaining kana string", following the numbered patterns shown for stage 2 in FIG. 3 (in one of these patterns, for example, the first word is "independent word 1"). If the conversion candidate for the first word is not the string the operator intended, the operator can display the next candidate with a control key, as in stage 1. If the first word is what the operator intended, the operator uses another control key to issue a command to proceed to stage 3. If no candidate for the first word can be found in stage 2, the processing of the word extraction unit 2 ends.
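The stage-2 display form "first word + remaining kana string" can be sketched as below. The function name is hypothetical, and ordering the displays from the longest reading to the shortest is an assumption; the actual order is defined by FIG. 3, which is not reproduced in the text.

```python
def stage2_displays(kana, dictionary):
    """List (first_word_candidate, remaining_kana) pairs for the head of `kana`.

    Assumed order: longer readings first, then dictionary order within a reading.
    """
    displays = []
    for end in range(len(kana), 0, -1):
        reading = kana[:end]
        for surface in dictionary.get(reading, []):
            displays.append((surface, kana[end:]))
    return displays
```

Each pair corresponds to one screen state: the candidate for the first word is shown converted, and the rest of the input stays in kana until the operator accepts the first word.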

In stage 3, the kana string that remains after removing the first word of the stage-2 patterns in FIG. 3 is processed as in stage 1: word combinations are compared and the operator is asked to select one (selection of up to two independent words).

If kana characters still remain after this, the word extraction unit 2 either ends its processing or repeats the stage-3 operation on the remaining kana string.
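The repeated "fix the first word, then continue with the remainder" flow can be sketched as a loop. The operator is simulated by an `accept` callback, which is an assumption for illustration, as are the function name and the choice to leave unmatched kana unconverted.

```python
def convert_word_by_word(kana, dictionary, accept):
    """Confirm words from the head of `kana` one at a time.

    `accept(surface)` stands in for the operator: True means the select key,
    False means the next-candidate key. Unmatched kana is left as is.
    """
    output = []
    rest = kana
    while rest:
        chosen = None
        for end in range(len(rest), 0, -1):          # longest reading first
            for surface in dictionary.get(rest[:end], []):
                if accept(surface):
                    chosen = (surface, end)
                    break
            if chosen:
                break
        if chosen is None:                           # no acceptable first word
            break
        output.append(chosen[0])
        rest = rest[chosen[1]:]
    return "".join(output) + rest
```

Because each word is fixed before the next one is considered, the operator only steps through the homophones of one word at a time rather than through every combination for the whole bunsetsu.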

In FIG. 3, the symbol rendered here as 0" denotes arbitrary repetition, square brackets [ ] denote an element that may or may not be present, and the arrows indicate the order in which word combinations are compared.

In the above embodiment, from stage 2 onward the display style may be changed so that the part already converted is easy to distinguish from the part not yet converted.

As described above, according to the present invention, a kana-kanji conversion processing method in which an input sentence is divided into segmentation units, each centered on an independent word, and entered in kana, and the corresponding mixed kanji-kana sentence is obtained successively, is provided with means for decomposing the bunsetsu-unit input kana string into words, means for displaying the conversion candidates for each decomposed word in order from the first word of the input kana string, and means for displaying the runner-up candidate when a conversion candidate is not the string the operator intended. When the output string intended by the operator has been decomposed into multiple words by the decomposing means, the operator determines the intended conversion string in order from the first word of the input kana string. This has the notable effect that multiple independent words are allowed in one bunsetsu segmentation unit while the number of homophone combinations is reduced, giving a kana-kanji conversion processing method that lightens the operator's burden.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of kana-kanji conversion processing according to one embodiment of the present invention, FIG. 2 is a diagram showing the details of the word extraction unit, and FIG. 3 is a diagram showing the processing steps.
1: input preprocessing unit, 2: word extraction unit, 2A: bunsetsu-unit kana string storage unit, 2B: word extraction control unit, 2C: candidate word storage unit, 2D: dictionary search control unit, 2E: conversion dictionary storage unit, 3: homophone discrimination unit, 4: output control unit, 5: conversion control unit.
Patent applicant: Ricoh Co., Ltd.
Agent: Patent attorney Masatoshi Isomura

Claims (1)

[Claims] A kana-kanji conversion processing method in which an input sentence is divided into segmentation units, each centered on an independent word, and entered in kana, and the corresponding mixed kanji-kana sentence is obtained successively, characterized in that it is provided with means for decomposing the bunsetsu-unit input kana string into words, means for displaying the conversion candidates for each decomposed word in order from the first word of the input kana string, and means for displaying the runner-up candidate when a conversion candidate is not the string the operator intended, and in that, when the output string intended by the operator has been decomposed into multiple words by the decomposing means, the operator determines the intended conversion string in order from the first word of the input kana string.
JP57109395A 1982-06-25 1982-06-25 Kana (japanese syllabary)-kanji (chinese character) converting system Pending JPS58225428A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP57109395A JPS58225428A (en) 1982-06-25 1982-06-25 Kana (japanese syllabary)-kanji (chinese character) converting system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP57109395A JPS58225428A (en) 1982-06-25 1982-06-25 Kana (japanese syllabary)-kanji (chinese character) converting system

Publications (1)

Publication Number Publication Date
JPS58225428A true JPS58225428A (en) 1983-12-27

Family

ID=14509148

Family Applications (1)

Application Number Title Priority Date Filing Date
JP57109395A Pending JPS58225428A (en) 1982-06-25 1982-06-25 Kana (japanese syllabary)-kanji (chinese character) converting system

Country Status (1)

Country Link
JP (1) JPS58225428A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS60225973A (en) * 1984-04-25 1985-11-11 Seiko Epson Corp Kana-to-kanji converting device
JPH0326868B2 (en) * 1984-04-25 1991-04-12 Seiko Epson Corp

Similar Documents

Publication Publication Date Title
US4777600A (en) Phonetic data-to-kanji character converter with a syntax analyzer to alter priority order of displayed kanji homonyms
JP2515726B2 (en) Information retrieval method and device
US5079701A (en) System for registering new words by using linguistically comparable reference words
JPS58225428A (en) Kana (japanese syllabary)-kanji (chinese character) converting system
JPS61248160A (en) Document information registering system
JPS5680770A (en) "kanji" (chinese character) input device for print
JPS58159133A (en) Kana (japanese syllabary)-to-kanji (chinese character) conversion processing system
JPS58182741A (en) Kana (japanese syllabary)-kanji (chinese character) conversion processor
JPS60129874A (en) Japanese word input device
JPS5887656A (en) Kana(japanese syllabary) and kanji(chinese character) conversion processing system
JPS5927338A (en) "kana" (japanese syllabary) and "kanji" (chinese character) conversion and processing system
JPS5932031A (en) Processor of japanese word information
JPS58159134A (en) Kana (japanese syllabary)-to-kanji (chinese character) conversion processing system
JPS59121425A (en) Chinese phonetic alphabet of kanji converter
JPH027159A (en) Japanese processor
JP3137329B2 (en) Document editing device
JPS5887618A (en) Kana (japanese syllabary)-kanji(chinese character) conversion processing system
JPS5887657A (en) Kana(japanese syllabary) kanji(chinese character) conversion processing system
JPS61223977A (en) Translation processor
JPH05165805A (en) Japanese syllabary/chinese character converter
JPS62271058A (en) Mechanical translation system
JPS63156275A (en) Automatic kana and katakana name dictionary adding system
JPS6072014A (en) "kana"-"kanji" converting device
JPS63116269A (en) Kana/kanji converter for japanese processing
JPH03176758A (en) Kana / kanji converter