JPH05135017A

JPH05135017A - Method for processing character and device therefor

Info

Publication number: JPH05135017A
Application number: JP3295665A
Authority: JP
Inventors: Yuji Kobayashi; 雄二小林; Eiichiro Toshima; 英一朗戸島; Hironori Suzuki; 大記鈴木; Kazuyo Ikeda; 和世池田
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1991-11-12
Filing date: 1991-11-12
Publication date: 1993-06-01

Abstract

PURPOSE: To make it possible to select a desired notation from the next candidates of kana (Japanese syllabary)/kanji (Chinese character) conversion, and to achieve kana-kanji conversion with high operability, by registering plural unknown words with the same reading in an unknown word learning dictionary. CONSTITUTION: A CPU, which is a microprocessor, executes calculation, logical judgement, etc. for character processing. When an input is supplied from a keyboard KB, an interrupt signal is first sent to the microprocessor CPU; the CPU reads the various control signals stored in a ROM, and various kinds of control are executed according to those signals. The unknown word learning dictionary is composed of a header including the management number, etc. of the dictionary, a main body that is the set of unknown words, and a management table for managing the unknown words, and it stores the reading, notation, and part of speech of each registered unknown word. The unknown word learning means executes additional registration even when an unknown word with the same reading as the unknown word to be newly registered is already registered in the unknown word learning dictionary.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to Japanese language processing, and more particularly to a processing apparatus for producing correct text in kana-kanji conversion.

[0002]

[Prior Art] Conventionally, Japanese input performs kana-kanji conversion using a word dictionary that stores the reading, notation, and part of speech of each word. The result of kana-kanji conversion may contain several words that share a reading but differ in notation, generally called homophones. The word that the user selects from among the homophones is stored as learning data, and a learning process ensures that this word is chosen uniquely among the homophones at the next kana-kanji conversion. For unknown words that are not registered in the word dictionary, learning is performed by newly registering them as unknown words.

[0003]

[Problem to Be Solved by the Invention] However, because conventional learning determines a word uniquely, it learns at most one word among those sharing a reading. When the user wants to use several notations with the same reading for a word that is not registered in the dictionary, such as an unknown word, each unlearned notation must either be produced anew by some conversion operation every time or be registered through the user's word registration operation, and the input work is interrupted.

[0004]

[Means for Solving the Problem] The present invention eliminates the above drawbacks of the prior art. To solve the problem, it has unknown word learning means that learns at least the reading and notation of an unknown word, and a learning dictionary in which unknown words are registered. The unknown word learning means additionally registers a new unknown word even when an unknown word with the same reading is already registered in the learning dictionary. Several notations can thus be held for unknown words with the same reading, and a kana-kanji conversion apparatus suited to the user's purpose can be built up.
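The core idea of additional registration can be illustrated with a minimal Python sketch. The class name `LearningDictionary`, its methods, and the newest-first ordering of candidates are assumptions made for illustration; the patent does not specify an implementation.

```python
from collections import defaultdict


class LearningDictionary:
    """Hypothetical learning dictionary: reading -> list of learned notations."""

    def __init__(self):
        self._entries = defaultdict(list)

    def learn(self, reading, notation):
        # Additional registration: an existing entry with the same reading
        # is kept, and the new notation is placed in front of it
        # (newest-first ordering is an assumption of this sketch).
        notations = self._entries[reading]
        if notation in notations:
            notations.remove(notation)
        notations.insert(0, notation)

    def candidates(self, reading):
        # Return a copy so callers cannot modify the stored list.
        return list(self._entries.get(reading, []))


learned = LearningDictionary()
learned.learn("きかい", "キカイ")
learned.learn("きかい", "キカい")  # same reading, second notation is added
print(learned.candidates("きかい"))
```

A conventional learner that keeps only one notation per reading would have overwritten "キカイ" at the second call; here both notations remain available as candidates.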

[0005]

[Embodiment] The present invention will be described in detail below with reference to the drawings.

[0006] FIG. 1 shows an example of the overall configuration of the present invention.

[0007] In the illustrated configuration, the CPU is a microprocessor that performs arithmetic operations, logical judgments, etc. for character processing, and controls each component connected to the address bus AB, the control bus CB, and the data bus DB via those buses.

[0008] The address bus AB transfers address signals that designate the components to be controlled by the microprocessor CPU. The control bus CB transfers and applies the control signals for each component controlled by the microprocessor CPU. The data bus DB transfers data between the components.

[0009] The ROM is a read-only fixed memory. PA is a program area that stores the control procedures executed by the microprocessor CPU, described later with reference to FIGS. 6 to 13.

[0010] The RAM is a writable random access memory organized as 16 bits per word, and is used for temporary storage of various data from each component.

[0011] TBUF is a document buffer, a memory for storing the document information input from the keyboard KB.

[0012] YBUF is an input reading buffer memory that stores the reading input from the keyboard KB.

[0013] DIC is the word dictionary used for kana-kanji conversion.

[0014] TDIC is an unknown word dictionary in which unknown words that are not registered in the word dictionary can be registered so that they can be learned.

[0015] GDIC is a learning data dictionary that stores the most-recent-use learning for kana-kanji conversion.

[0016] DBPOOL is a homophone pool that stores the candidates for kana-kanji conversion.

[0017] LRNDAT is a learning data storage memory that stores the learning state of individual words and usage examples.

[0018] KB is a keyboard provided with character and symbol input keys, such as alphabet keys, hiragana keys, and katakana keys, and with various function keys, such as a conversion key that instructs conversion.

[0019] In FIG. 1, YOMI is a key for inputting a reading; CON is a conversion instruction key for converting the input reading; NXT is a next-candidate conversion instruction key for changing the conversion candidate to the next candidate; SEL is a selection key that confirms the currently displayed homophone candidate and at the same time instructs that its notation be learned; HIRA is a hiragana conversion key that confirms the input reading in hiragana notation; KATA is a katakana conversion key that confirms the input reading in katakana notation; and TAN is a single-kanji conversion instruction key for converting the input reading at the single-kanji level.

[0020] DISK is a memory for storing fixed-form documents. It holds created documents, and a stored document is called up when needed by an instruction from the keyboard.

[0021] CR is a cursor register. The CPU can read and write its contents. The CRT controller CRTC, described later, displays a cursor at the position on the display device CRT that corresponds to the address stored in this register.

[0022] DBUF is a display buffer memory that stores the patterns of the document information, etc. held in TBUF.

[0023] CRTC displays the contents stored in the cursor register CR and the buffer DBUF on the display device CRT.

[0024] The CRT is a display device using a cathode ray tube or the like; the display of dot-matrix patterns and of the cursor on the display device CRT is controlled by the CRT controller.

[0025] CG is a character generator that stores the patterns of the characters and symbols displayed on the display device CRT.

[0026] The character processing apparatus of the present invention, composed of these components, operates in response to various inputs from the keyboard KB. When an input is supplied from the keyboard KB, an interrupt signal is first sent to the microprocessor CPU, the microprocessor CPU reads the various control signals stored in the ROM, and various kinds of control are performed in accordance with those control signals.

[0027] An example of unknown word learning in the apparatus of this embodiment, configured as above, is described below with reference to FIG. 2.

[0028] (a) is the state of the CRT before any characters are input. In the figure, (q) indicates the cursor.

[0029] When the characters "き", "か", "い" (ki, ka, i) are input in this state, the CRT becomes as shown in (b). The underline indicated by (r) shows that the kana-kanji conversion result has not yet been confirmed.

[0030] When the conversion key is input in state (b), kana-kanji conversion is executed on the input character string, and the CRT becomes as shown in (c). When the conversion key is input again, the next-candidate notations are displayed in the monitor area (s) at the bottom of the screen, and a candidate is selected by inputting the corresponding numeric key. When the katakana conversion key is input while the kana-kanji conversion result is not yet confirmed, the reading is converted to the corresponding katakana notation, as in (e). When the selection key is then input, the unknown word with reading "きかい" and notation "キカイ" is learned, the notation is confirmed as "キカイ", and the CRT becomes as shown in (f). When "きかい" is input and converted again from this state, it is converted to the learned unknown word "キカイ", as in (g), (h), and (i), and the registered unknown word "キカイ" also appears among the next-candidate notations, as in (j). When the katakana conversion key is input further, the converted katakana is changed back to hiragana one character at a time, starting from the last character, as in (k). When the selection key is input in this state, the unknown word with reading "きかい" and notation "キカい" is registered in addition to "キカイ". Similarly, when the characters "き", "か", "い" are input again and the conversion key is input, the input is converted to "キカい" by unknown word learning, the CRT becomes as shown in (o), and the two unknown words "キカい" and "キカイ" coexist among the next-candidate notations.
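The step shown in (k), where katakana is turned back into hiragana one character at a time from the end, can be sketched in Python as follows. The function names are hypothetical, and the sketch relies on the fixed code point offset between the Unicode hiragana and katakana blocks rather than on anything the patent describes.

```python
KANA_OFFSET = 0x60  # code point distance between the hiragana and katakana blocks


def to_hiragana_char(ch):
    # Katakana U+30A1..U+30F6 map onto hiragana U+3041..U+3096.
    if "\u30a1" <= ch <= "\u30f6":
        return chr(ord(ch) - KANA_OFFSET)
    return ch


def tail_to_hiragana(text, count):
    """Return text with its last `count` characters converted to hiragana."""
    if count <= 0:
        return text
    head, tail = text[:-count], text[-count:]
    return head + "".join(to_hiragana_char(ch) for ch in tail)


# Each further press of the katakana key converts one more trailing character.
steps = [tail_to_hiragana("キカイ", n) for n in range(4)]
print(steps)
```

The second element of `steps` is the mixed notation "キカい" that the example registers alongside "キカイ".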

[0031] FIG. 4 shows the structure of the unknown word learning dictionary TDIC.

[0032] The unknown word learning dictionary consists of a header containing the dictionary's management number, etc., a main body that is the set of unknown words, and a management table for managing the unknown words. For each registered unknown word, the reading, notation, and part of speech are stored. The example in the figure is an unknown word with reading "きかい", notation "キカイ", and part of speech "noun".
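The three-part layout described above can be modelled with a short Python sketch. The class and field names are hypothetical, and list indices stand in for the addresses that the real dictionary would hold; the actual binary layout is not given in the text.

```python
from dataclasses import dataclass, field


@dataclass
class UnknownWord:
    reading: str         # e.g. "きかい"
    notation: str        # e.g. "キカイ"
    part_of_speech: str  # e.g. "noun"


@dataclass
class UnknownWordDictionary:
    header: dict = field(default_factory=dict)            # management number, etc.
    body: list = field(default_factory=list)              # the set of unknown words
    management_table: list = field(default_factory=list)  # addresses into body


tdic = UnknownWordDictionary(header={"management_number": 1})
tdic.body.append(UnknownWord("きかい", "キカイ", "noun"))
tdic.management_table.append(0)  # address (here: list index) of the entry above

entry = tdic.body[tdic.management_table[0]]
print(entry.reading, entry.notation, entry.part_of_speech)
```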

[0033] FIG. 5 shows the structure of the management table. It stores addresses within the unknown word dictionary, and the relative age of each entry (newer or older) is managed by its position in the table.
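Managing age by position means that no timestamps are needed; the order of the table itself records which entry is newest and which is oldest. The Python sketch below assumes that the front of the table is the newest slot, which the text does not state, and the function names and addresses are hypothetical.

```python
def touch_newest(table, address):
    """Place address at the front of the table (assumed: front = newest)."""
    if address in table:
        table.remove(address)
    table.insert(0, address)


def oldest(table):
    """Return the address in the last slot, or None for an empty table."""
    return table[-1] if table else None


table = []
for address in (0x10, 0x24, 0x38):  # registration order: 0x10 first
    touch_newest(table, address)

print([hex(a) for a in table], hex(oldest(table)))
```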

[0034] The operation of the above embodiment is described below following the flowcharts.

[0035] FIG. 6 is a flowchart showing the operation of the character processing apparatus of the present invention.

[0036] In S6-1, the apparatus waits for a key to be pressed on the keyboard and an interrupt to occur. When a key is input, S6-2 identifies the key and branches, according to the type of key, to one of steps S6-3, S6-4, S6-5, S6-6, S6-7, S6-8, S6-9, S6-10, S6-11, and S6-12.
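The branch in S6-2 amounts to a lookup from key type to step. The Python sketch below uses the key names and step numbers given in this description; the table name, the function name, and the "LEFT" cursor key are hypothetical.

```python
# Key name -> flowchart step, as assigned in the description of FIG. 6.
STEP_FOR_KEY = {
    "YOMI": "S6-3",   # reading input
    "CON": "S6-4",    # kana-kanji conversion
    "TAN": "S6-5",    # single-kanji conversion
    "HIRA": "S6-6",   # hiragana conversion
    "KATA": "S6-7",   # katakana conversion
    "NXT": "S6-8",    # next candidate
    "SEL": "S6-9",    # selection and learning
    "TOR": "S6-10",   # word registration
    "LST": "S6-11",   # list of registered words
}


def step_for_key(key):
    # Any other key (cursor movement and so on) falls through to S6-12.
    return STEP_FOR_KEY.get(key, "S6-12")


print([step_for_key(k) for k in ("YOMI", "CON", "SEL", "LEFT")])
```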

[0037] S6-3 is the processing performed when the reading input key YOMI is pressed; the code of the input reading is stored in the input reading buffer memory YBUF.

[0038] S6-4 is the processing performed when the conversion key CON is pressed; the character string that was input in S6-3 and is stored in YBUF as the target of kana-kanji conversion is converted into kanji and written to the output buffer. During the conversion, the learning data dictionary is consulted to determine the first candidate among the homophones.

[0039] S6-5 is the processing performed when the single-kanji conversion key TAN is pressed; the character string that was input in S6-3 and is stored in YBUF as the target of kana-kanji conversion is converted at the single-kanji level and written to the output buffer.

[0040] S6-6 is the processing performed when the hiragana conversion key HIRA is pressed; the character string that was input in S6-3 and is stored in YBUF as the target of kana-kanji conversion is converted to hiragana notation and written to the output buffer.

[0041] S6-7 is the processing performed when the katakana conversion key KATA is pressed; the character string that was input in S6-3 and is stored in YBUF as the target of kana-kanji conversion is converted to katakana notation and written to the output buffer.

[0042] S6-8 is the processing performed when the next-candidate conversion key NXT is pressed; another candidate among the homophones in the output buffer produced by S6-4 is displayed.

[0043] S6-9 is the processing performed when the selection key SEL is pressed; the homophone in the output buffer that is displayed on the screen is confirmed, and the confirmed character string is output into the document. The selected word is then learned.

[0044] S6-10 is the processing performed when the registration key TOR is pressed; the word specified by the user is registered in the registered word dictionary TDIC.

[0045] S6-11 is the processing performed when the list key LST is pressed; a list of the registered words in the registered word dictionary TDIC is displayed.

[0046] S6-12 is the processing performed when a key other than YOMI, CON, NXT, and SEL is pressed (for example, a key used in document editing, such as a cursor movement key). This processing is commonly performed in character processing apparatuses of this kind and is well known, so it is not described here.

[0047] S6-13 is display processing that displays the parts changed as a result of the above processing. It is the ordinary, widely used processing of reading the data in the document one character at a time, expanding each character into a pattern, and outputting it to the display buffer.

[0048] FIG. 7 is a flowchart showing the processing of S6-4 in detail.

[0049] S7-1 analyzes the input character string to be kana-kanji converted, which has been entered segmented into bunsetsu (phrase) units, and outputs the candidates for the conversion output to the homophone pool. The character string is taken out one segmented unit at a time, the word dictionary is searched and the string is analyzed, and only candidates recognized as bunsetsu are output to the homophone pool. This processing is commonly performed in character processing apparatuses of this kind and is well known, so it is not described here.

[0050] S7-2 checks, for the analysis results output to the homophone pool in S7-1, whether a usage-example pattern stored in the word dictionary applies; if one does, the homophone candidate covered by that usage example is picked up as a priority candidate.

[0051] S7-3 determines the first candidate for kana-kanji conversion from among the priority candidates picked up in S7-2 and the candidates that have been learned as words.
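The choice made in S7-3 can be sketched in Python as below. The text does not say how priority candidates and learned candidates rank against each other, so the order used here (usage-example priority first, then learned words, then the pool's own order) is an assumption, as are the function and parameter names.

```python
def first_candidate(pool, priority, learned):
    """Pick the first conversion candidate from the homophone pool.

    pool:     all candidates produced by the analysis (S7-1)
    priority: candidates picked up through usage examples (S7-2)
    learned:  candidates that have been learned as words
    """
    # Assumed order of preference: priority, then learned, then pool order.
    for group in (priority, learned):
        for notation in group:
            if notation in pool:
                return notation
    return pool[0] if pool else None


pool = ["機械", "機会", "キカイ"]
print(first_candidate(pool, priority=[], learned=["キカイ"]))
print(first_candidate(pool, priority=["機会"], learned=["キカイ"]))
```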

[0052] S7-4 displays the kana-kanji conversion output stored in the output buffer. This processing is commonly performed in character processing apparatuses of this kind and is well known, so it is not described here.

[0053] FIG. 8 is a flowchart showing the processing of S6-9 in detail.

[0054] In S8-1, if the selected notation exists in the dictionary, the learning processing of S8-3 is executed directly. If it does not exist in the dictionary, the unknown word registration processing of S8-2 is executed first, and then the most-recent-use learning processing is executed in S8-3. The most-recent-use learning processing of S8-3 learns the selected word and raises its priority in subsequent conversions. After the learning processing ends, confirmation processing that confirms the homophone is performed. The learning processing updates the learning data corresponding to a word that exists in the dictionary or to a registered word; it is commonly performed in character processing apparatuses of this kind and is well known, so it is not described here.
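The selection flow of FIG. 8 can be sketched in Python as follows. Plain dicts stand in for the word dictionary, the unknown word dictionary, and the most-recent-use learning data; the function name and these data shapes are assumptions for illustration only.

```python
def select(notation, reading, dictionary, unknown_words, recent):
    """Sketch of S8-1 to S8-3; returns the confirmed notation."""
    known = (notation in dictionary.get(reading, [])
             or notation in unknown_words.get(reading, []))
    if not known:                                               # S8-1
        # S8-2: register as an unknown word, keeping earlier notations.
        unknown_words.setdefault(reading, []).append(notation)
    recent[reading] = notation   # S8-3: most-recent-use learning
    return notation              # confirmation of the homophone


dictionary = {"きかい": ["機械", "機会"]}
unknown_words, recent = {}, {}

select("キカイ", "きかい", dictionary, unknown_words, recent)
select("キカい", "きかい", dictionary, unknown_words, recent)
print(unknown_words, recent)
```

Selecting a notation that already exists, such as "機械", skips S8-2 and only updates the most-recent-use data.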

[0055] FIG. 9 is a flowchart showing the processing of S8-2 in detail.

[0056] S9-1 checks whether the unknown word dictionary has free space; if it does not, processing branches to the deletion of the oldest registered unknown word. S9-2 checks whether the management table that manages the registered unknown words has free space; if it does not, processing likewise branches to the deletion of the oldest registered unknown word. Once free space is confirmed, the unknown word is registered in S9-4. When registration is complete, the management table is updated in S9-5. S9-4 registers a new word in the dictionary in the same way as the processing ordinarily performed for user word registration; it is commonly performed in character processing apparatuses of this kind and is well known, so it is not described here.
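The registration flow with its two free-space checks can be sketched in Python as below. The class name, the two capacity parameters, the integer addresses, and the oldest-first table order are all assumptions of this sketch rather than details from the text.

```python
class UnknownWordStore:
    """Hypothetical unknown word dictionary with a bounded size."""

    def __init__(self, word_capacity, table_capacity):
        if word_capacity < 1 or table_capacity < 1:
            raise ValueError("capacities must be at least 1")
        self.word_capacity = word_capacity
        self.table_capacity = table_capacity
        self.words = {}   # address -> (reading, notation)
        self.table = []   # addresses, oldest first (assumed ordering)
        self._next_address = 0

    def _delete_oldest(self):                           # S9-3
        address = self.table.pop(0)
        del self.words[address]

    def register(self, reading, notation):
        if len(self.words) >= self.word_capacity:       # S9-1: dictionary full?
            self._delete_oldest()
        if len(self.table) >= self.table_capacity:      # S9-2: table full?
            self._delete_oldest()
        address = self._next_address
        self._next_address += 1
        self.words[address] = (reading, notation)       # S9-4: register
        self.table.append(address)                      # S9-5: update table
        return address


store = UnknownWordStore(word_capacity=2, table_capacity=2)
for notation in ("キカイ", "キカい", "キかい"):
    store.register("きかい", notation)
print([store.words[a][1] for a in store.table])
```

With room for two entries, the third registration evicts the oldest notation "キカイ" and keeps the two newer ones.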

[0057] FIG. 10 is a flowchart showing the processing of S9-3 in detail.

[0058] S10-1 searches the management table for the oldest registered unknown word. S10-2 deletes the retrieved registered unknown word from the unknown word dictionary. S10-3 updates the management table once the deletion is complete.
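The three deletion steps can be sketched in Python with a list as the dictionary body and list indices as addresses. The function name, the newest-first table order, and the way remaining indices are re-pointed after the deletion are assumptions of this sketch.

```python
def delete_oldest(body, table):
    """Remove the oldest unknown word; return it, or None if there is none.

    body:  list of (reading, notation) entries
    table: indices into body, newest first (assumed), so the oldest is last
    """
    if not table:
        return None
    oldest_index = table[-1]            # S10-1: find the oldest entry
    removed = body.pop(oldest_index)    # S10-2: delete it from the dictionary
    table.pop()                         # S10-3: drop its slot ...
    # ... and re-point entries that moved down by one position in body.
    table[:] = [i - 1 if i > oldest_index else i for i in table]
    return removed


body = [("きかい", "キカイ"), ("きかい", "キカい"), ("きかい", "キかい")]
table = [2, 1, 0]   # index 0 was registered first, so it is the oldest

print(delete_oldest(body, table), body, table)
```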

[0059] FIG. 11 is a flowchart showing the processing of S9-5 or S10-3 in detail.

[0060] S11-1 sets the counter variable i to 1. S11-2 checks whether the address of the i-th unknown word exists in the management table. If it exists, processing branches to S11-3, which gets that unknown word's address within the unknown word dictionary from the management table. S11-4 updates the address to match the current unknown word dictionary. S11-5 puts it back into the management table. S11-6 updates i and moves processing on to the next unknown word in the management table.
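The loop of FIG. 11 can be sketched in Python as below. The text does not say how an address is brought up to date in S11-4; this sketch assumes that an entry of `size` bytes was removed at `removed_at` and that every later entry moved down by that amount. The function name, the `None` marker for an empty slot, and the stop at the first empty slot are likewise assumptions.

```python
def update_management_table(table, removed_at, size):
    """Re-point table addresses after `size` bytes were removed at `removed_at`."""
    i = 1                                                 # S11-1 (1-based, as in the flowchart)
    while i <= len(table) and table[i - 1] is not None:   # S11-2: i-th address exists?
        address = table[i - 1]                            # S11-3: get
        if address > removed_at:                          # S11-4: match the current dictionary
            address -= size
        table[i - 1] = address                            # S11-5: put
        i += 1                                            # S11-6: next unknown word
    return table


# Entries used to start at 0, 16, 40, and 64; the 24-byte entry at 16 was deleted.
table = [0, 40, 64, None]
print(update_management_table(table, removed_at=16, size=24))
```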

[0061] (Other Embodiments) In the above description, the age of registered unknown words is managed by a management table that stores their addresses within the unknown word dictionary, but the age can also be managed by the actual position at which each word is stored in the unknown word dictionary. It is also possible to attach age management information to each stored word itself. Furthermore, by updating the management information every time a registered unknown word is used in conversion, management according to usage can be performed.
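The last variation, where the management information is refreshed on every use, behaves like a least-recently-used policy. The Python sketch below uses `collections.OrderedDict` to illustrate it; the class name, the capacity parameter, and the method names are assumptions, and the text does not prescribe this data structure.

```python
from collections import OrderedDict


class UsageManagedWords:
    """Hypothetical unknown word store that evicts the least recently used word."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._words = OrderedDict()   # (reading, notation) keys, least recently used first

    def register(self, reading, notation):
        key = (reading, notation)
        if key in self._words:
            self._words.move_to_end(key)
            return
        if len(self._words) >= self.capacity:
            self._words.popitem(last=False)   # drop the least recently used word
        self._words[key] = None

    def use(self, reading, notation):
        # Called whenever a registered word is used in a conversion.
        key = (reading, notation)
        if key in self._words:
            self._words.move_to_end(key)

    def notations(self, reading):
        return [n for (r, n) in self._words if r == reading]


words = UsageManagedWords(capacity=2)
words.register("きかい", "キカイ")
words.register("きかい", "キカい")
words.use("きかい", "キカイ")        # "キカイ" becomes the most recently used
words.register("きかい", "キかい")   # evicts "キカい", not "キカイ"
print(words.notations("きかい"))
```

Under purely age-based management the third registration would have evicted "キカイ" instead, because it was registered first.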

[0062] The present invention is applicable not only to a single apparatus but also to a system composed of a plurality of apparatuses, and it goes without saying that it can also be realized by supplying software to an apparatus or system.

[0063]

[Effect of the Invention] As described above, according to the present invention, registering a plurality of unknown words with the same reading in the unknown word learning dictionary makes it possible to select the desired notation from the next candidates of kana-kanji conversion. Kana-kanji conversion with excellent operability can thus be realized, and a kana-kanji conversion apparatus suited to the user's purpose can be built up.

[Brief description of drawings]

FIG. 1 is a block diagram showing the overall configuration of the character processing apparatus of this embodiment.

FIG. 2 is a diagram showing an example in which unknown word learning is executed in kana-kanji conversion.

FIG. 3 is a diagram showing an example in which unknown word learning is executed in kana-kanji conversion.

FIG. 4 is a diagram showing an example of the structure of the unknown word dictionary.

FIG. 5 is a diagram showing an example of the structure of the management table of the unknown word dictionary.

FIG. 6 is a flowchart showing an example of the processing procedure for the overall operation of this embodiment.

FIG. 7 is a flowchart showing an example of the processing procedure for the overall kana-kanji conversion of this embodiment.

FIG. 8 is a flowchart showing an example of the processing procedure for the selection processing of this embodiment.

FIG. 9 is a flowchart showing an example of the processing procedure for the unknown word registration processing of this embodiment.

FIG. 10 is a flowchart showing an example of the processing procedure for the oldest unknown word deletion processing of this embodiment.

FIG. 11 is a flowchart showing an example of the processing procedure for the management table update processing of this embodiment.

[Explanation of symbols]

CPU: Microprocessor
DIC: Kana-kanji conversion dictionary
ROM: Read-only memory
RAM: Random access memory
TBUF: Document buffer
YBUF: Input reading buffer
DIC: Word dictionary
TDIC: Registered word dictionary
GDIC: Learning data dictionary
DBPOOL: Homophone pool
LRNDAT: Learning data storage memory
DISK: External storage
PRT: Printing device
KB: Keyboard
CR: Cursor register
DBUF: Display buffer memory
CRTC: CRT controller
CRT: Display device
CG: Character generator

Continuation of the front page: (72) Inventor Kazuyo Ikeda, c/o Canon Inc., 3-30-2 Shimomaruko, Ota-ku, Tokyo

Claims

[Claims]

1. A character processing device comprising: word dictionary means that stores, for the reading of a word, word information including a notation character string; unknown word learning means that learns at least the reading and notation of an unknown word not registered in the word dictionary means; and learning dictionary means that stores the learning information produced by the unknown word learning means, characterized in that the unknown word learning means additionally registers a new unknown word even if an unknown word having the same reading as the unknown word to be newly registered is already registered in the learning dictionary means.

2. A character processing method for a character processing apparatus comprising a word dictionary that stores, for the reading of a word, word information including a notation character string, the method being characterized by learning and storing at least the reading and notation of an unknown word not registered in the word dictionary, and additionally registering a new unknown word even if an unknown word having the same reading as the unknown word to be newly registered has already been registered.