JPH0721177A

JPH0721177A - Character input device

Info

Publication number: JPH0721177A
Application number: JP5148746A
Authority: JP
Inventors: Osamu Nakamura; 修中村; Masami Oguro; 雅己小黒; Tadashi Kitamura; 正北村
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1993-06-21
Filing date: 1993-06-21
Publication date: 1995-01-24

Abstract

PURPOSE: To enable smooth input of kanji-notation word strings, when there is a handwritten or printed manuscript to be input, by applying character recognition technology to automate the selection of homonyms. CONSTITUTION: A character recognizing means 101 converts entered character string image data into one or more sets of character codes, and a kana-kanji converting means 102 converts a character code string of hiragana or the like, input in correspondence with that image data, into kanji-notation word string candidates. A collating means 103 collates the two and orders the kanji-notation word string candidates according to their degree of matching with the character code candidates. An input/output control means 104 displays the character image data and the ordered kanji-notation word strings, and selects and confirms a kanji-notation word string according to an external instruction.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a device for inputting printed or handwritten character strings into a computer system or the like. More particularly, it relates to a character input device that converts a character code string entered in hiragana, katakana, or romaji into a kanji-notation character string corresponding to the handwritten character shapes, and inputs the result.

[0002]

[Prior Art] With the spread of word processors, Japanese front-end programs (FEP), and the like, inputting Japanese character strings into computer systems has become commonplace. To input Japanese in kanji notation, these Japanese input means require a so-called kana-kanji conversion function, which converts a hiragana or katakana character string entered from a keyboard into a word string in kanji notation. This function eliminates the need for the large number of keys found on a Japanese typewriter and makes kanji-notation word strings comparatively easy to enter. The main role of kana-kanji conversion is to search a word dictionary by the reading of a word entered in hiragana or katakana and to show the results on a display. Recent kana-kanji conversion functions also commonly include a function that learns the kanji-notation words selected by the input person.

[0003]

[Problems to Be Solved by the Invention] When a kana-kanji conversion function is used to input a kanji-notation word string from the readings of words, input efficiency depends on how homonyms are selected. Conventional kana-kanji conversion is based on having the input person select the desired word from among multiple homonyms, and the time needed for this selection is not negligible compared with the time needed to enter the hiragana or katakana reading. Furthermore, selecting homonyms not only requires knowledge of the correct kanji notation but also demands considerable concentration to tell apart confusingly similar kanji notations, which hinders smooth Japanese input.

[0004] As noted above, conventional kana-kanji conversion functions generally include a word learning function. However, this learning function merely memorizes, mechanically, the homonym that the input person selected most recently, and it is insufficient for selecting the correct homonym in the varied situations of Japanese input. Attempts have also been made to automate homonym determination with Japanese-language analysis techniques that use grammatical information and the like. For homonyms such as nouns, which are treated as equivalent in Japanese-language analysis, these techniques are likewise insufficient, and the choice must still be left to the input person.

[0005] The present invention has been made in view of the above circumstances. Its object is to provide a character input device that, when there is a handwritten or printed manuscript to be input, resolves the above problems of the prior art and enables smooth input of kanji-notation word strings by applying character recognition technology to automate the selection of homonyms.

[0006]

[Means for Solving the Problems] The above object of the present invention is achieved by a character input device characterized in that: character recognition means converts sequentially input entered-character-string image data into one or more sets of character codes; kana-kanji conversion means converts the hiragana, katakana, or romaji character code string entered in correspondence with the entered character string image data into one or more sets of kanji-notation word string candidates; collating means collates the kanji-notation word string candidates against the character code candidates and orders the kanji-notation word string candidates according to their degree of matching with the character code candidates; and input/output control means displays the character image data and the ordered kanji-notation word strings, and selects and confirms a kanji-notation word string according to an external instruction. The kanji-notation word string referred to here may be a character string consisting of a single word or a character string containing multiple words.

[0007]

[Operation] With the character input device according to the present invention, when a handwritten or printed manuscript is keyed in, the homonym that matches the character shapes on the manuscript can be selected automatically. Smooth Japanese input is thus achieved without the key input person being troubled by homonym selection.

[0008]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a functional block diagram of a character input device according to the present invention. In FIG. 1, 101 is character recognition means that converts character image data into character codes, and 102 is kana-kanji conversion means that converts a hiragana, katakana, or romaji character string into a kanji-notation word string. 103 is collating means that collates the character codes against the kanji-notation word strings, and 104 is input/output control means that displays processing results and accepts instructions for selecting and confirming a kanji-notation word string. 105 denotes character string image data on the manuscript, 106 key-entered character code data, 107 character code candidates resulting from character recognition, 108 kanji-notation word string candidates, 109 kanji-notation word string candidates ranked, as a result of collation, in order of closeness to the character string image shapes on the manuscript, 110 display data (the above kanji-notation word string candidates and the character image data), 111 key input data from the operator, and 112 the kanji-notation word string that is the final output.

[0009] The operation of the functional block configuration shown in FIG. 1 is briefly described below. First, the character recognition means 101 performs character recognition on the character string image data 105 on the manuscript, which is input from outside, and outputs the character code (JIS or Shift JIS code) corresponding to the image data of each character together with a character score (a measure of how likely the image is to be that character code). Various means can be used to acquire the character string image data; using a scanner or a FAX machine as the input device is common. Specifically, character recognition is carried out in four steps: preprocessing, such as normalization of character size; extraction of character features; calculation, using the extracted features, of a distance value (corresponding to the character score above) between the input character pattern and the registered character patterns in a character dictionary; and finally conversion to the corresponding character code based on that distance value. The operation of the character recognition means 101 is described in detail later with reference to FIG. 2.

[0010] Meanwhile, the kana-kanji conversion means 102 converts the character string made up of hiragana, katakana, or romaji character code data 106, entered by the input person's key operations, into a kanji-notation word string by searching a word dictionary, and outputs the result. In this conversion, homonyms that share the same reading but differ in kanji notation are all output as multiple candidates. The operation of the kana-kanji conversion means 102 is described in detail later with reference to FIG. 3.

[0011] Next, the collating means 103 matches the character code candidates 107, which are the character recognition result, against the kanji-notation word string candidates 108, which are the kana-kanji conversion result. It ranks the kanji-notation word strings according to the degree of agreement between the two results and outputs them as candidates 109. The operation of the collating means 103 is described in detail later with reference to FIGS. 4 and 5.

[0012] Finally, the input/output control means 104 first outputs, as display data 110, the kanji-notation word string candidates 109 in rank order, together with the character image data, to the outside (a display). The person entering the hiragana, katakana, or romaji character codes (the operator) checks whether the displayed character string image data has been correctly converted into a kanji-notation word string. If the first-ranked kanji-notation word is wrong, the operator selects the correct word and conveys the result to the input/output control means 104. The input/output control means 104 outputs the kanji-notation word string 112 confirmed by the instruction given through the operator's key input data 111.

[0013] Next, an example of the operation of a character recognition means 101 that can be adopted is described in detail with reference to FIG. 2. In FIG. 2, 201 is a preprocessing unit that performs normalization of character image data size and the like, 202 is a feature extraction unit that extracts from the character image data the features used for character recognition, 203 is a distance calculation unit that calculates the distance between those features and the features of standard character patterns registered in advance as character information, 204 is a character dictionary, and 205 is a code conversion unit that obtains the character codes corresponding to the standard patterns for which the calculated distance is relatively small. In addition, 206 denotes the input character image data after preprocessing, 207 the extracted character features, 208 the character features of the standard patterns from the character dictionary, and 209 the result of the distance calculation between the character features of the input character image data and those of the standard patterns.

[0014] The preprocessing unit 201 mainly applies noise removal and normalization to the input character image data. Noise removal here means removing fine black pixels that were added because of stains on the paper or the like. Normalization aims to match the conditions of the subsequent feature extraction on the input image data to the conditions under which the character features of the standard patterns to be compared were extracted; it enlarges or reduces the entered character size to a predetermined character size.
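The two preprocessing steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the image is taken to be a binary grid (1 = black), "fine black pixels" are interpreted as pixels with no black neighbour, and scaling uses nearest-neighbour sampling. The function names `remove_isolated_pixels` and `normalize_size` are hypothetical; the patent does not specify any of these details.

```python
def remove_isolated_pixels(img):
    """Clear black pixels (1) that have no black 8-neighbour.

    Assumed noise rule for illustration; the source only says that fine
    black pixels caused by stains are removed.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1:
                continue
            has_neighbour = any(
                img[ny][nx] == 1
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbour:
                out[y][x] = 0
    return out


def normalize_size(img, size):
    """Crop to the bounding box of the black pixels, then enlarge or reduce
    the crop to size x size by nearest-neighbour sampling."""
    rows = [y for y, row in enumerate(img) if any(row)]
    cols = [x for x in range(len(img[0])) if any(row[x] for row in img)]
    if not rows:  # blank image: nothing to scale
        return [[0] * size for _ in range(size)]
    top, left = rows[0], cols[0]
    box_h = rows[-1] - top + 1
    box_w = cols[-1] - left + 1
    return [
        [img[top + (y * box_h) // size][left + (x * box_w) // size]
         for x in range(size)]
        for y in range(size)
    ]
```

Running noise removal before normalization keeps a stray speck from inflating the bounding box that determines the scale factor.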

[0015] The feature extraction unit 202 extracts the features needed for character recognition from the preprocessed input character image data 206. Various character features have been proposed. For example, the outer-contour direction contribution feature described in Hagita et al., 「外郭方向寄与度特徴による漢字の識別」 ("Identification of Kanji by Outer-Contour Direction Contribution Features"), IEICE Transactions, Vol. J66-D, No. 10, 1983, can be used.

[0016] The distance calculation unit 203 performs distance calculation between the character features 207 of the input character image data obtained by the above operation and the character features 208 of the standard patterns registered in advance in the character dictionary 204, and identifies which standard patterns in the character dictionary 204 the input character image data is close to in shape.

[0017] Based on the distance calculation result 209, the code conversion unit 205 obtains and outputs the character codes corresponding to the standard patterns identified as relatively close in shape. JIS codes, Shift JIS codes, or the like can be used as the character codes here.
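A minimal sketch of the distance calculation and code conversion steps follows. It assumes that features are plain numeric vectors and that Euclidean distance is the metric; the source names neither, and the feature values in the usage note are invented for illustration. `rank_characters` and `to_shift_jis` are hypothetical names. How a distance is turned into the "character score" used later is also not stated in the source, so the sketch stops at distances.

```python
import math


def rank_characters(features, dictionary, top_n=3):
    """Return the top_n (character, distance) pairs, nearest first.

    `dictionary` maps each character to the feature vector of its
    standard pattern (an assumed representation of the character dictionary).
    """
    distances = [
        (char, math.dist(features, pattern))
        for char, pattern in dictionary.items()
    ]
    distances.sort(key=lambda pair: pair[1])
    return distances[:top_n]


def to_shift_jis(char):
    """Encode one character as Shift JIS bytes using Python's built-in codec."""
    return char.encode("shift_jis")
```

For example, with the made-up dictionary `{"資": [1.0, 0.0], "賢": [0.9, 0.2], "料": [0.0, 1.0]}` and input features `[1.0, 0.1]`, `rank_characters(..., top_n=2)` lists "資" first and "賢" second, which mirrors how visually similar kanji end up as neighbouring candidates.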

[0018] Next, the operation of the kana-kanji conversion means 102 is described in detail with reference to FIG. 3. FIG. 3 shows an example of operation when the reading "しりょう" (shiryō) is given as input. In FIG. 3, 301 is a search key generation unit that generates search keys for the subsequent word dictionary search, 302 is a word search unit that executes the word dictionary search, 303 is the word dictionary, and 304 and 305 denote search keys.

[0019] The search key generation unit 301 generates word dictionary search keys that are extended one character at a time from the beginning to the end of the input hiragana or katakana character code string. For example, when the hiragana character code string "しりょう" is input, it generates "し", "しり", "しりょ", and "しりょう". For simplicity, FIG. 3 shows the case where only "しりょう" is generated as a search key.
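The key generation described above amounts to listing every prefix of the reading. A minimal sketch, where the function name `generate_search_keys` is hypothetical:

```python
def generate_search_keys(reading):
    """Return every prefix of the reading, shortest first.

    Each key extends the previous one by one character, from the first
    character up to the full reading.
    """
    return [reading[:i] for i in range(1, len(reading) + 1)]


keys = generate_search_keys("しりょう")
# keys == ["し", "しり", "しりょ", "しりょう"]
```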

[0020] Next, the word search unit 302 searches the word dictionary 303 using the search keys. In the example of FIG. 3, the search key "しりょう" yields a total of six kanji-notation words (homonyms) as search results: "史料", "死霊", "思量", "試料", "資料", and "飼料". The kana-kanji conversion means 102 described with reference to FIG. 3 follows a method commonly adopted in Japanese word processors and the like. Various methods other than the one shown here are conceivable, and any of them can be adopted in the character input device according to the present invention.
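The dictionary search can be pictured as a lookup from reading to homonym list. The sketch below assumes the word dictionary is an in-memory mapping and contains only the single entry given in the FIG. 3 example; `WORD_DICTIONARY` and `search_words` are hypothetical names, and a real dictionary would of course hold many more readings.

```python
# Only the entry from the FIG. 3 example; other readings are omitted.
WORD_DICTIONARY = {
    "しりょう": ["史料", "死霊", "思量", "試料", "資料", "飼料"],
}


def search_words(keys, dictionary):
    """Look up each search key and return {key: homonyms} for keys that hit.

    All homonyms for a reading are returned, since the conversion step is
    expected to output every candidate rather than pick one.
    """
    results = {}
    for key in keys:
        words = dictionary.get(key, [])
        if words:
            results[key] = list(words)
    return results


hits = search_words(["し", "しり", "しりょ", "しりょう"], WORD_DICTIONARY)
# Only "しりょう" has an entry in this toy dictionary, so hits has one key.
```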

[0021] FIG. 4 is a functional block diagram showing an embodiment of the collating means 103. In FIG. 4, 401 is a character candidate table that temporarily stores the character codes resulting from character recognition together with their character scores, 402 is a word score calculation unit that calculates the scores of kanji-notation word (word string) candidates based on the character scores, 403 is a word candidate table that temporarily stores the kanji-notation word candidates resulting from kana-kanji conversion and their word scores, and 404 is a sorting unit that sorts the kanji-notation word candidates by score. The detailed operation of the collating means 103 with the functional block configuration shown in FIG. 4 is described next with reference to FIG. 5.

[0022] FIG. 5 is a drawing for explaining the operation of the collating means 103. In FIG. 5, 401 and 403 are the character candidate table and the word candidate table shown in FIG. 4. The character candidate table 401 stores the character code candidates resulting from character recognition and the character score of each candidate. FIG. 5 shows an example in which, as the character recognition result for the character image data "資料", the first character has the candidates "賢" (9 points), "資" (8 points), and "寛" (6 points), and the second character has the candidates "料" (10 points), "科" (8 points), and "村" (5 points). The word candidate table 403 stores the kanji-notation word candidates resulting from kana-kanji conversion. FIG. 5 shows an example in which a total of six word candidates are stored: "史料", "死霊", "思量", "試料", "資料", and "飼料".

[0023] In calculating word scores, when a word candidate in the word candidate table 403 contains a character code candidate from the character candidate table 401, that character score is added to the word score of the word candidate that contains it, and the total is stored in the word score field of the word candidate table 403. In the example of FIG. 5, the word scores are as follows: "史料" scores 10 points, from "料" only; "死霊" and "思量" score 0 points because they contain no character code candidate; "試料" scores 10 points, from "料" only; "資料" scores 18 points in total, 8 points from "資" and 10 points from "料"; and "飼料" scores 10 points, from "料" only. In the sorting performed by the sorting unit 404, the word candidates in the word candidate table 403 are reordered starting from the candidate with the highest word score. As a result, in the example of FIG. 5, first place is "資料", and second place is shared by "史料", "試料", and "飼料".
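The scoring and sorting in the FIG. 5 example can be reproduced with a short sketch. One assumption is made explicit here: a character only counts when it appears among the candidates for the same position in the word, which is how the figure's per-character candidate lists read but is not spelled out in the text. The names `score_word` and `rank_words` are hypothetical.

```python
# Character candidate table 401 from the FIG. 5 example:
# one {character: score} mapping per character position.
CHARACTER_CANDIDATES = [
    {"賢": 9, "資": 8, "寛": 6},
    {"料": 10, "科": 8, "村": 5},
]

# Word candidates from the kana-kanji conversion of "しりょう".
WORD_CANDIDATES = ["史料", "死霊", "思量", "試料", "資料", "飼料"]


def score_word(word, character_candidates):
    """Sum the scores of the recognized characters the word contains.

    Position-wise matching is an assumption made for this sketch.
    """
    total = 0
    for position, char in enumerate(word):
        if position < len(character_candidates):
            total += character_candidates[position].get(char, 0)
    return total


def rank_words(words, character_candidates):
    """Return (word, score) pairs sorted by descending score.

    Python's sort is stable, so tied words keep their original order.
    """
    scored = [(w, score_word(w, character_candidates)) for w in words]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_words(WORD_CANDIDATES, CHARACTER_CANDIDATES)
# "資料" comes first with 18 points; "史料", "試料", "飼料" tie at 10.
```

Note that "賢" outscores "資" as a single character, yet "資料" still ranks first because no dictionary word for this reading begins with "賢". The dictionary constraint is what corrects the recognition error.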

[0024] For simplicity, an example in which the conversion target is a single word has been described. It goes without saying that the present invention can be practiced in the same way for a word string containing multiple words, by ranking the candidates according to the sum of their word scores.
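One way the multi-word case might look is sketched below. The source gives no multi-word example, so everything here is hypothetical: the character candidate table simply repeats the FIG. 5 table for a four-character image, each candidate word string is a list of words, and characters are matched position by position across the concatenated words. `score_word_string` and `rank_word_strings` are illustrative names.

```python
def score_word_string(words, character_candidates):
    """Sum per-character scores over all words in a candidate word string.

    `offset` tracks where each word starts within the recognized
    character sequence (position-wise matching is assumed).
    """
    total = 0
    offset = 0
    for word in words:
        for i, char in enumerate(word):
            position = offset + i
            if position < len(character_candidates):
                total += character_candidates[position].get(char, 0)
        offset += len(word)
    return total


def rank_word_strings(candidates, character_candidates):
    """Order candidate word strings by descending total score."""
    return sorted(
        candidates,
        key=lambda words: score_word_string(words, character_candidates),
        reverse=True,
    )


# Hypothetical four-character input: the FIG. 5 table repeated twice.
fig5_table = [{"賢": 9, "資": 8, "寛": 6}, {"料": 10, "科": 8, "村": 5}]
table = fig5_table * 2
ranked = rank_word_strings(
    [["飼料", "資料"], ["資料", "資料"], ["死霊", "思量"]], table
)
# ["資料", "資料"] totals 36, ["飼料", "資料"] 28, ["死霊", "思量"] 0.
```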

[0025]

[Effects of the Invention] As described above, according to the present invention, when character strings are input from a handwritten or printed manuscript, the key-entered hiragana, katakana, or romaji is collated with the character recognition result for the character string image on the manuscript. This makes it possible to automatically select the homonym that matches the character shapes on the manuscript, with the notable effect that smooth Japanese input is achieved without the key input person being troubled by homonym selection.

[Brief description of drawings]

FIG. 1 is a block diagram illustrating an embodiment of the character input device according to the present invention.

FIG. 2 is a diagram illustrating the operation of the character recognition means.

FIG. 3 is a diagram illustrating the operation of the kana-kanji conversion means.

FIG. 4 is a functional block diagram showing an embodiment of the collating means.

FIG. 5 is a diagram illustrating the operation of the collating means.

[Explanation of symbols]

101 Character recognition means
102 Kana-kanji conversion means
103 Collating means
104 Input/output control means
105 Character string image data
106 Character code data
107 Character code candidates
108 Kanji-notation word string candidates
109 Ranked kanji-notation word string candidates
110 Display data
111 Key input data
112 Kanji-notation word string

Claims

[Claims]

1. A character input device comprising: character recognition means for converting character image data into character code candidates; kana-kanji conversion means for converting a hiragana, katakana, or romaji character code string, input in correspondence with the character image data, into kanji-notation word string candidates; collating means for collating the character code candidates obtained by the character recognition means against the kanji-notation word string candidates obtained by the kana-kanji conversion means and ordering the kanji-notation word string candidates according to their degree of matching with the character code candidates; and input/output control means for displaying the character image data and the ordered kanji-notation word string candidates, and for selecting and confirming a kanji-notation word string from among the kanji-notation word string candidates according to an external instruction.